Standardizing and Scaffolding Healthcare AI-Chatbot Evaluation

medRxiv - Health Policy. Pub Date: 2024-07-21. DOI: 10.1101/2024.07.21.24310774

Yining Hua, Winna Xia, David W. Bates, Luke Hartstein, Hyungjin Tom Kim, Michael Lingzhi Li, Benjamin W Nelson, Charles Stromeyer, Darlene King, Jina Suh, Li Zhou, John Torous

Abstract

The rapid rise of healthcare chatbots, a market valued at $787.1 million in 2022 and projected to grow at 23.9% annually through 2030, underscores the need for robust evaluation frameworks. Despite their potential, the absence of standardized evaluation criteria and the rapid pace of AI advancement complicate assessment. This study addresses these challenges by developing the first comprehensive evaluation framework, inspired by health app regulations and integrating insights from diverse stakeholders. Following PRISMA guidelines, we reviewed 11 existing frameworks and refined 271 questions into a structured framework encompassing three priority constructs, 18 second-level constructs, and 60 third-level constructs. Our framework emphasizes safety, privacy, trustworthiness, and usefulness, aligning with recent concerns about AI in healthcare. This adaptable framework aims to serve as an initial step in facilitating the responsible integration of chatbots into healthcare settings.
