Generative AI chatbots for reliable cancer information: Evaluating web-search, multilingual, and reference capabilities of emerging large language models

IF 7.6 | Zone 1 (Medicine) | Q1 Oncology | European Journal of Cancer | Pub Date: 2025-02-03 | DOI: 10.1016/j.ejca.2025.115274

Bradley D. Menz, Natansh D. Modi, Ahmad Y. Abuhelwa, Warit Ruanglertboon, Agnes Vitry, Yuan Gao, Lee X. Li, Rakchha Chhetri, Bianca Chu, Stephen Bacchi, Ganessan Kichenadasse, Adel Shahnam, Andrew Rowland, Michael J. Sorich, Ashley M. Hopkins

European Journal of Cancer, volume 218, Article 115274. Journal Article; not open access.
Source: https://www.sciencedirect.com/science/article/pii/S0959804925000553

Citations: 0

Abstract

Recent advancements in large language models (LLMs) enable real-time web search, improved referencing, and multilingual support, yet ensuring they provide safe health information remains crucial. This perspective evaluates seven publicly accessible LLMs (ChatGPT, Co-Pilot, Gemini, MetaAI, Claude, Grok, and Perplexity) on three simple cancer-related queries across eight languages: English, French, Chinese, Thai, Hindi, Nepali, Vietnamese, and Arabic (336 responses in total). None of the 42 English responses contained clinically meaningful hallucinations, whereas 7 of 294 non-English responses did. Valid references appeared in 48% (162/336) of responses, but 39% of the English references were .com links, reflecting quality concerns. English responses frequently exceeded an eighth-grade reading level, and many non-English outputs were also complex. These findings reflect substantial progress over the past two years but reveal persistent gaps in multilingual accuracy, reliable reference inclusion, referral practices, and readability. Ongoing benchmarking is essential to ensure LLMs safely support global health information dichotomy and meet online information standards.
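The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported figures, not the authors' analysis. It assumes a balanced design (the same number of responses for every model, query, and language combination), which the abstract does not state; all names are illustrative.

```python
from fractions import Fraction

# Counts quoted in the abstract.
N_MODELS = 7
N_QUERIES = 3
N_LANGUAGES = 8
TOTAL_RESPONSES = 336
ENGLISH_RESPONSES = 42
NON_ENGLISH_RESPONSES = 294
NON_ENGLISH_HALLUCINATIONS = 7
RESPONSES_WITH_VALID_REFS = 162

# Assumption (not stated in the abstract): the design is balanced, so each
# model/query/language cell holds total / cells responses.
cells = N_MODELS * N_QUERIES * N_LANGUAGES
responses_per_cell = Fraction(TOTAL_RESPONSES, cells)


def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


if __name__ == "__main__":
    print("cells:", cells)
    print("responses per cell (assumed balanced):", responses_per_cell)
    print("English + non-English:", ENGLISH_RESPONSES + NON_ENGLISH_RESPONSES)
    print("valid references (%):",
          percent(RESPONSES_WITH_VALID_REFS, TOTAL_RESPONSES))
    print("non-English hallucinations (%):",
          percent(NON_ENGLISH_HALLUCINATIONS, NON_ENGLISH_RESPONSES))
```

Under that assumption the totals are internally consistent: 168 cells with two responses each gives 336, the English and non-English counts sum to 336, 162/336 rounds to 48.2%, and 7/294 rounds to 2.4%.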


Source journal

European Journal of Cancer (Medicine, Oncology)

CiteScore: 11.50
Self-citation rate: 4.80%
Articles published: 953
Review time: 23 days

About the journal: The European Journal of Cancer (EJC) serves as a comprehensive platform integrating preclinical, digital, translational, and clinical research across the spectrum of cancer. From epidemiology, carcinogenesis, and biology to groundbreaking innovations in cancer treatment and patient care, the journal covers a wide array of topics. We publish original research, reviews, previews, editorial comments, and correspondence, fostering dialogue and advancement in the fight against cancer. Join us in our mission to drive progress and improve outcomes in cancer research and patient care.