Evaluating the interactions of Medical Doctors with chatbots based on large language models: Insights from a nationwide study in the Greek healthcare sector using ChatGPT
Loukas Triantafyllopoulos , Georgios Feretzakis , Lazaros Tzelves , Aikaterini Sakagianni , Vassilios S. Verykios , Dimitris Kalles
{"title":"Evaluating the interactions of Medical Doctors with chatbots based on large language models: Insights from a nationwide study in the Greek healthcare sector using ChatGPT","authors":"Loukas Triantafyllopoulos , Georgios Feretzakis , Lazaros Tzelves , Aikaterini Sakagianni , Vassilios S. Verykios , Dimitris Kalles","doi":"10.1016/j.chb.2024.108404","DOIUrl":null,"url":null,"abstract":"<div><p>In this AI-focused era, researchers are delving into AI applications in healthcare, with ChatGPT being a primary focus. This Greek study involved 182 doctors from various regions, utilizing a custom web application connected to ChatGPT 4.0. Doctors from diverse departments and experience levels engaged with ChatGPT, which provided tailored responses. Over a month, data was collected using a form with a 1-to-5 rating scale. The results showed varying satisfaction levels across four criteria: clarity, response time, accuracy, and overall satisfaction. ChatGPT's response speed received high ratings (3.85/5.0), whereas clarity of information was moderately rated (3.43/5.0). A significant observation was the correlation between a doctor's experience and their satisfaction with ChatGPT. More experienced doctors (over 21 years) reported lower satisfaction (2.80–3.74/5.0) compared to their less experienced counterparts (3.43–4.20/5.0). At the medical field level, Internal Medicine showed higher satisfaction in evaluation criteria (ranging from 3.56 to 3.88), compared to other fields, while Psychiatry scored higher overall, with ratings from 3.63 to 5.00. The study also compared two departments: Urology and Internal Medicine, with the latter being more satisfied with the accuracy, and clarity of provided information, response time, and overall compared to Urology. These findings illuminate the specific needs of the health sector and highlight both the potential and areas for improvement in ChatGPT's provision of specialized medical information. Despite current limitations, ChatGPT, in its present version, offers a valuable resource to the medical community, signaling further advancements and potential integration into healthcare practices.</p></div>","PeriodicalId":48471,"journal":{"name":"Computers in Human Behavior","volume":"161 ","pages":"Article 108404"},"PeriodicalIF":9.0000,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers in Human Behavior","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0747563224002723","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0
Abstract
In this AI-focused era, researchers are delving into AI applications in healthcare, with ChatGPT being a primary focus. This Greek study involved 182 doctors from various regions, who used a custom web application connected to ChatGPT 4.0. Doctors from diverse departments and experience levels engaged with ChatGPT, which provided tailored responses. Over one month, data were collected using a form with a 1-to-5 rating scale. The results showed varying satisfaction levels across four criteria: clarity, response time, accuracy, and overall satisfaction. ChatGPT's response speed received high ratings (3.85/5.0), whereas clarity of information was rated moderately (3.43/5.0). A notable observation was the correlation between a doctor's experience and their satisfaction with ChatGPT: more experienced doctors (over 21 years) reported lower satisfaction (2.80–3.74/5.0) than their less experienced counterparts (3.43–4.20/5.0). At the medical-field level, Internal Medicine showed higher satisfaction across the evaluation criteria (ranging from 3.56 to 3.88) than other fields, while Psychiatry scored higher overall, with ratings from 3.63 to 5.00. The study also compared two departments, Urology and Internal Medicine, with the latter reporting greater satisfaction than Urology in the accuracy and clarity of the provided information, the response time, and overall. These findings illuminate the specific needs of the health sector and highlight both the potential and the areas for improvement in ChatGPT's provision of specialized medical information. Despite current limitations, ChatGPT, in its present version, offers a valuable resource to the medical community, signaling further advancements and potential integration into healthcare practices.
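The abstract describes the study setup only at a high level: a custom web application connected to ChatGPT 4.0, with each response rated by the doctor on a 1-to-5 scale across four criteria. As an illustration only, the following is a minimal Python sketch of how such an interaction-and-rating flow could be wired together using the OpenAI chat completions API; the model identifier, criterion names, storage format, and helper functions are assumptions for the sketch, not details taken from the paper.

```python
# Minimal sketch (not the authors' implementation): forward a doctor's question
# to a GPT-4-class model and record a 1-to-5 rating for each evaluation criterion.
# Assumes the `openai` Python package and an OPENAI_API_KEY environment variable.
import csv
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Criterion names assumed from the four criteria reported in the abstract.
CRITERIA = ["clarity", "response_time", "accuracy", "overall_satisfaction"]


def ask_chatgpt(question: str) -> str:
    """Send a doctor's question to the model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed identifier; the paper refers to "ChatGPT 4.0"
        messages=[
            {"role": "system",
             "content": "You assist medical doctors with specialized clinical questions."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


def record_rating(doctor_id: str, specialty: str, years_experience: int,
                  ratings: dict[str, int], path: str = "ratings.csv") -> None:
    """Append one evaluation form (one 1-to-5 score per criterion) to a CSV file."""
    if not all(1 <= ratings[c] <= 5 for c in CRITERIA):
        raise ValueError("Each criterion must be rated on the 1-to-5 scale.")
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(
            [datetime.now(timezone.utc).isoformat(), doctor_id,
             specialty, years_experience] + [ratings[c] for c in CRITERIA]
        )


if __name__ == "__main__":
    answer = ask_chatgpt("What are first-line options for uncomplicated cystitis?")
    print(answer)
    record_rating("doc-001", "Internal Medicine", 12,
                  {"clarity": 4, "response_time": 5,
                   "accuracy": 4, "overall_satisfaction": 4})
```

Storing one row per completed form, keyed by specialty and years of experience, is what would later allow the per-field and per-experience-group comparisons the abstract reports (e.g., Internal Medicine vs. Urology, or doctors with over 21 years of experience vs. less experienced ones).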
About the Journal
Computers in Human Behavior is a scholarly journal that explores the psychological aspects of computer use. It covers original theoretical works, research reports, literature reviews, and software and book reviews. The journal examines both the use of computers in psychology, psychiatry, and related fields, and the psychological impact of computer use on individuals, groups, and society. Articles discuss topics such as professional practice, training, research, human development, learning, cognition, personality, and social interactions. It focuses on human interactions with computers, considering the computer as a medium through which human behaviors are shaped and expressed. Professionals interested in the psychological aspects of computer use will find this journal valuable, even with limited knowledge of computers.