Leveraging Large Language Models and Machine Learning for Success Analysis in Robust Cancer Crowdfunding Predictions: Quantitative Study.
Runa Bhaumik, Abhishikta Roy, Vineet Srivastava, Lokesh Boggavarapu, Ranganathan Chandrasekaran, Edward K Mensah, John Galvin
Background: Recent advances in large language models (LLMs) such as GPT-4o offer a transformative opportunity to extract nuanced linguistic, emotional, and social features from medical crowdfunding campaign texts at scale. These models enable a deeper understanding of the factors influencing campaign success far beyond what structured data alone can reveal. Given these advancements, there is a pressing need for an integrated modeling framework that leverages both LLM-derived features and machine learning algorithms to more accurately predict and explain success in medical crowdfunding.
Objective: This study addressed the failure of existing approaches to capture the deeper psychosocial and clinical nuances that influence campaign success. It leveraged cutting-edge machine learning techniques alongside state-of-the-art LLMs such as GPT-4o to automatically generate and extract nuanced linguistic, social, and clinical features from campaign narratives. By combining these features with ensemble learning approaches, the proposed methodology offers a novel and more comprehensive strategy for understanding and predicting crowdfunding success in the medical domain.
Methods: We used GPT-4o to extract linguistic and social determinants of health features from cancer crowdfunding campaign narratives. A random forest model with permutation importance was applied to rank features based on their contribution to predicting campaign success. Four machine learning algorithms (random forest, gradient boosting, logistic regression, and elastic net) were evaluated using stratified 10-fold cross-validation, with performance measured through accuracy, sensitivity, and specificity.
Results: Gradient boosting consistently outperformed the other algorithms in terms of sensitivity (0.786 to 0.798), indicating its superior ability to identify successful crowdfunding campaigns using linguistic and social determinants of health features. Permutation importance scores revealed that severe medical conditions, income loss, chemotherapy treatment, clear and effective communication, cognitive understanding, family involvement, empathy, and social behaviors play an important role in the success of campaigns.
Conclusions: This study demonstrates that LLMs such as GPT-4o can effectively extract nuanced linguistic and social features from crowdfunding narratives, offering deeper insights than traditional methods. These features, when combined with machine learning, significantly improve the identification of key predictors of campaign success, such as medical severity, financial hardship, and empathetic communication. Our findings underscore the potential of LLMs to enhance predictive modeling in health-related crowdfunding and support more targeted policy and communication strategies to reduce financial vulnerability among patients with cancer.
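The Methods describe a standard model-comparison design; below is a minimal sketch of that design in scikit-learn, assuming the GPT-4o-derived features have already been extracted into a tabular file. The file name, column names, and hyperparameters are illustrative assumptions, not the authors' code.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import StratifiedKFold, cross_validate

# Hypothetical table of GPT-4o-derived features plus a binary success label.
df = pd.read_csv("campaign_features.csv")
X, y = df.drop(columns=["success"]), df["success"]

models = {
    "random_forest": RandomForestClassifier(random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
    "logistic_regression": LogisticRegression(max_iter=1000),
    # Elastic net expressed as a penalized logistic classifier.
    "elastic_net": LogisticRegression(penalty="elasticnet", solver="saga",
                                      l1_ratio=0.5, max_iter=5000),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scoring = {
    "accuracy": "accuracy",
    "sensitivity": make_scorer(recall_score, pos_label=1),  # recall on successes
    "specificity": make_scorer(recall_score, pos_label=0),  # recall on failures
}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    print(name, {m: round(scores[f"test_{m}"].mean(), 3) for m in scoring})

# Rank features by permutation importance on a fitted random forest.
rf = RandomForestClassifier(random_state=42).fit(X, y)
imp = permutation_importance(rf, X, y, n_repeats=10, random_state=42)
print(pd.Series(imp.importances_mean, index=X.columns).nlargest(10))
```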
{"title":"Leveraging Large Language Models and Machine Learning for Success Analysis in Robust Cancer Crowdfunding Predictions: Quantitative Study.","authors":"Runa Bhaumik, Abhishikta Roy, Vineet Srivastava, Lokesh Boggavarapu, Ranganathan Chandrasekaran, Edward K Mensah, John Galvin","doi":"10.2196/73448","DOIUrl":"10.2196/73448","url":null,"abstract":"<p><strong>Background: </strong>Recent advances in large language models (LLMs) such as GPT-4o offer a transformative opportunity to extract nuanced linguistic, emotional, and social features from medical crowdfunding campaign texts at scale. These models enable a deeper understanding of the factors influencing campaign success far beyond what structured data alone can reveal. Given these advancements, there is a pressing need for an integrated modeling framework that leverages both LLM-derived features and machine learning algorithms to more accurately predict and explain success in medical crowdfunding.</p><p><strong>Objective: </strong>This study addressed the gap of failure to capture the deeper psychosocial and clinical nuances that influence campaign success. It leveraged cutting-edge machine learning techniques alongside state-of-the-art LLMs such as GPT-4o to automatically generate and extract nuanced linguistic, social, and clinical features from campaign narratives. By combining these features with ensemble learning approaches, the proposed methodology offers a novel and more comprehensive strategy for understanding and predicting crowdfunding success in the medical domain.</p><p><strong>Methods: </strong>We used GPT-4o to extract linguistic and social determinants of health features from cancer crowdfunding campaign narratives. A random forest model with permutation importance was applied to rank features based on their contribution to predicting campaign success. Four machine learning algorithms-random forest, gradient boosting, logistic regression, and elastic net-were evaluated using stratified 10-fold cross-validation, with performance measured through accuracy, sensitivity, and specificity.</p><p><strong>Results: </strong>Gradient boosting consistently outperformed the other algorithms in terms of sensitivity (consistently 0.786 to 0.798), indicating its superior ability to identify successful crowdfunding campaigns using linguistic and social determinants of health features. The permutation importance score revealed that for severe medical conditions, income loss, chemotherapy treatment, clear and effective communication, cognitive understanding, family involvement, empathy, and social behaviors play an important role in the success of campaigns.</p><p><strong>Conclusions: </strong>This study demonstrates that LLMs such as GPT-4o can effectively extract nuanced linguistic and social features from crowdfunding narratives, offering deeper insights than traditional methods. These features, when combined with machine learning, significantly improve the identification of key predictors of campaign success, such as medical severity, financial hardship, and empathetic communication. 
Our findings underscore the potential of LLMs to enhance predictive modeling in health-related crowdfunding and support more targeted policy and communication strategies to reduce financial vulnerability among patients with cancer.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e73448"},"PeriodicalIF":2.0,"publicationDate":"2025-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12629620/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145558576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection of Medical Misinformation in Hemangioma Patient Education: Comparative Study of ChatGPT-4o and DeepSeek-R1 Large Language Models.
Guoyong Wang, Ye Zhang, Weixin Wang, Yingjie Zhu, Wei Lu, Chaonan Wang, Hui Bi, Xiaonan Yang
Background: This study examines the capability of large language models (LLMs) in detecting medical rumors, using hemangioma-related information as an example. It compares the performances of ChatGPT-4o and DeepSeek-R1.
Objective: This study aimed to evaluate and compare the accuracy, stability, and expert-rated reliability of 2 LLMs, ChatGPT-4o and DeepSeek-R1, in classifying medical information related to hemangiomas as either "rumors" or "accurate information."
Methods: We collected 82 publicly available texts from social media platforms, medical education websites, international guidelines, and journals. Of the 82 items, 47/82 (57%) were labeled as "rumors," and 35/82 (43%) were labeled as "accurate information." Three vascular anomaly specialists with extensive clinical experience independently annotated the texts in a double-blinded manner, and disagreements were resolved by arbitration to ensure labeling reliability. Subsequently, these texts were input into ChatGPT-4o and DeepSeek-R1, with each model generating 2 rounds of results under identical instructions. Output stability was assessed using bidirectional encoder representations from transformers (BERT)-based semantic similarity scores. Classification accuracy, precision, recall, and F1-score were calculated to evaluate performance. Additionally, 2 medical experts independently rated the model outputs using a 5-point scale based on clinical guidelines. Statistical analyses included paired t tests, Wilcoxon signed-rank tests, and bootstrap resampling to compute confidence intervals.
Results: In terms of semantic stability, the similarity distributions for the 2 models largely overlapped, with no statistically significant difference observed (mean difference=-0.003, 95% CI -0.011 to 0.005; P=.30). Regarding classification performance, DeepSeek-R1 achieved higher accuracy (0.963) compared to ChatGPT-4o (0.910), and also performed better in terms of precision (0.978 vs 0.940), recall (0.957 vs 0.894), and F1-score (0.967 vs 0.916). Expert evaluations revealed that DeepSeek-R1 significantly outperformed ChatGPT-4o on both "rumor" items (mean difference=0.431; P<.001; Cohen dz=0.594) and "accurate information" items (mean difference=0.264; P=.045; Cohen dz=0.352), with a particularly pronounced advantage in rumor detection.
Conclusions: DeepSeek-R1 demonstrated greater accuracy and rationale in detecting medical rumors compared with ChatGPT-4o. This study provides empirical support for the application of LLMs and recommends optimizing accuracy and incorporating real-time verification mechanisms to mitigate the harmful impact of misleading information on patient health.
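A minimal sketch of the two quantitative steps in the Methods: embedding-based stability between a model's two response rounds, and classification metrics against the expert labels. The sentence-transformers encoder named here is an assumption (any BERT-family encoder would fill the same role), and the example texts and labels are placeholders.

```python
from sentence_transformers import SentenceTransformer, util
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed BERT-style encoder

# Two generation rounds for the same items (placeholder texts).
round1 = ["Propranolol is a first-line therapy for infantile hemangioma."]
round2 = ["First-line treatment for infantile hemangioma is propranolol."]
emb1, emb2 = encoder.encode(round1), encoder.encode(round2)
stability = util.cos_sim(emb1, emb2).diagonal()  # per-item semantic similarity

# Classification metrics: 1 = accurate information, 0 = rumor.
expert_labels = [1]
model_labels = [1]
acc = accuracy_score(expert_labels, model_labels)
prec, rec, f1, _ = precision_recall_fscore_support(expert_labels, model_labels,
                                                   average="binary")
print(stability, acc, prec, rec, f1)
```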
{"title":"Detection of Medical Misinformation in Hemangioma Patient Education: Comparative Study of ChatGPT-4o and DeepSeek-R1 Large Language Models.","authors":"Guoyong Wang, Ye Zhang, Weixin Wang, Yingjie Zhu, Wei Lu, Chaonan Wang, Hui Bi, Xiaonan Yang","doi":"10.2196/76372","DOIUrl":"10.2196/76372","url":null,"abstract":"<p><strong>Background: </strong>This study examines the capability of large language models (LLMs) in detecting medical rumors, using hemangioma-related information as an example. It compares the performances of ChatGPT-4o and DeepSeek-R1.</p><p><strong>Objective: </strong>This study aimed to evaluate and compare the accuracy, stability, and expert-rated reliability of 2 LLMs, ChatGPT-4o and DeepSeek-R1, in classifying medical information related to hemangiomas as either \"rumors\" or \"accurate information.\"</p><p><strong>Methods: </strong>We collected 82 publicly available texts from social media platforms, medical education websites, international guidelines, and journals. Of the 82 items, 47/82 (57%) were labeled as \"rumors,\" and 35/82 (43%) were labeled as \"accurate information.\" Three vascular anomaly specialists with extensive clinical experience independently annotated the texts in a double-blinded manner, and disagreements were resolved by arbitration to ensure labeling reliability. Subsequently, these texts were input into ChatGPT-4o and DeepSeek-R1, with each model generating 2 rounds of results under identical instructions. Output stability was assessed using bidirectional encoder representations from transformers-based semantic similarity scores. Classification accuracy, precision, recall, and F1-score were calculated to evaluate the performance. Additionally, 2 medical experts independently rated the model outputs using a 5-point scale based on clinical guidelines. Statistical analyses included paired t tests, Wilcoxon signed-rank tests, and bootstrap resampling to compute confidence intervals.</p><p><strong>Results: </strong>In terms of semantic stability, the similarity distributions for the 2 models largely overlapped, with no statistically significant difference observed (mean difference=-0.003, 95% CI -0.011 to 0.005; P=.30). Regarding classification performance, DeepSeek-R1 achieved higher accuracy (0.963) compared to ChatGPT-4o (0.910), and also performed better in terms of precision (0.978 vs 0.940), recall (0.957 vs 0.894), and F1-score (0.967 vs 0.916). Expert evaluations revealed that DeepSeek-R1 significantly outperformed ChatGPT-4o on both \"rumor\" items (mean difference=0.431; P<.001; Cohen dz=0.594) and \"accurate information\" items (mean difference=0.264; P=.045; Cohen dz=0.352), with a particularly pronounced advantage in rumor detection.</p><p><strong>Conclusions: </strong>DeepSeek-R1 demonstrated greater accuracy and rationale in detecting medical rumors compared with ChatGPT-4o. 
This study provides empirical support for the application of LLMs and recommends optimizing accuracy and incorporating real-time verification mechanisms to mitigate the harmful impact of misleading information on patient health.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e76372"},"PeriodicalIF":2.0,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12627899/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145552318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Standardizing and Scaffolding Health Care AI-Chatbot Evaluation: Systematic Review.
Yining Hua, Winna Xia, David Bates, George Luke Hartstein, Hyungjin Tom Kim, Michael Li, Benjamin W Nelson, Charles Stromeyer Iv, Darlene King, Jina Suh, Li Zhou, John Torous
Background: Health care chatbots are rapidly proliferating, while generative artificial intelligence (AI) outpaces existing evaluation standards.
Objective: We aimed to develop a structured, stakeholder-informed framework to standardize evaluation of health care chatbots.
Methods: PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses)-guided searches across multiple databases identified 266 records; 152 were screened, 21 full texts were assessed, and 11 frameworks were included. We extracted 356 questions (refined to 271 by deduplication and relevance review), mapped items to Coalition for Health AI constructs, and organized them with iterative input from clinicians, patients, developers, epidemiologists, and policymakers.
Results: We developed the Health Care AI Chatbot Evaluation Framework (HAICEF), a hierarchical framework with 3 priority domains (safety, privacy, and fairness; trustworthiness and usefulness; and design and operational effectiveness) and 18 second-level and 60 third-level constructs covering 271 questions. Emphasis includes data provenance and harm control; Health Insurance Portability and Accountability Act/General Data Protection Regulation-aligned privacy and security; bias management; and reliability, transparency, and workflow integration. Question distribution across domains is as follows: design and operational effectiveness, 40%; trustworthiness and usefulness, 39%; and safety, privacy, and fairness, 21%. The framework accommodates both patient-facing and back-office use cases.
Conclusions: HAICEF provides an adaptable scaffold for standardized evaluation and responsible implementation of health care chatbots. Planned next steps include prospective validation across settings and a Delphi consensus to extend accountability and accessibility assurances.
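The Results describe a three-level hierarchy (domains containing constructs containing questions); below is a minimal sketch of one way such a framework could be represented and tallied programmatically. The construct and question entries are illustrative placeholders, not HAICEF's actual items.

```python
from dataclasses import dataclass, field

@dataclass
class Construct:
    name: str
    questions: list[str] = field(default_factory=list)
    children: list["Construct"] = field(default_factory=list)  # third-level constructs

@dataclass
class Domain:
    name: str
    constructs: list[Construct] = field(default_factory=list)

# The three priority domains named in the Results; items are placeholders.
haicef = [
    Domain("Safety, privacy, and fairness", [
        Construct("Harm control",
                  ["Does the chatbot escalate crisis language to a human?"]),
    ]),
    Domain("Trustworthiness and usefulness"),
    Domain("Design and operational effectiveness"),
]

def count_questions(constructs: list[Construct]) -> int:
    """Recursively tally questions across second- and third-level constructs."""
    return sum(len(c.questions) + count_questions(c.children) for c in constructs)

print(sum(count_questions(d.constructs) for d in haicef))  # -> 1 for this stub
```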
{"title":"Standardizing and Scaffolding Health Care AI-Chatbot Evaluation: Systematic Review.","authors":"Yining Hua, Winna Xia, David Bates, George Luke Hartstein, Hyungjin Tom Kim, Michael Li, Benjamin W Nelson, Charles Stromeyer Iv, Darlene King, Jina Suh, Li Zhou, John Torous","doi":"10.2196/69006","DOIUrl":"10.2196/69006","url":null,"abstract":"<p><strong>Background: </strong>Health care chatbots are rapidly proliferating, while generative artificial intelligence (AI) outpaces existing evaluation standards.</p><p><strong>Objective: </strong>We aimed to develop a structured, stakeholder-informed framework to standardize evaluation of health care chatbots.</p><p><strong>Methods: </strong>PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses)-guided searches across multiple databases identified 266 records; 152 were screened, 21 full texts were assessed, and 11 frameworks were included. We extracted 356 questions (refined to 271 by deduplication and relevance review), mapped items to Coalition for Health AI constructs, and organized them with iterative input from clinicians, patients, developers, epidemiologists, and policymakers.</p><p><strong>Results: </strong>We developed the Health Care AI Chatbot Evaluation Framework (HAICEF), a hierarchical framework with 3 priority domains (safety, privacy, and fairness; trustworthiness and usefulness; and design and operational effectiveness) and 18 second-level and 60 third-level constructs covering 271 questions. Emphasis includes data provenance and harm control; Health Insurance Portability and Accountability Act/General Data Protection Regulation-aligned privacy and security; bias management; and reliability, transparency, and workflow integration. Question distribution across domains is as follows: design and operational effectiveness, 40%; trustworthiness and usefulness, 39%; and safety, privacy and fairness, 21%. The framework accommodates both patient-facing and back-office use cases.</p><p><strong>Conclusions: </strong>HAICEF provides an adaptable scaffold for standardized evaluation and responsible implementation of health care chatbots. Planned next steps include prospective validation across settings and a Delphi consensus to extend accountability and accessibility assurances.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e69006"},"PeriodicalIF":2.0,"publicationDate":"2025-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12639340/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145472290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI in Health Care Service Quality: Systematic Review.
Eman Alghareeb, Najla Aljehani
Background: Artificial intelligence (AI) is a rapidly evolving technology with the potential to revolutionize the health care industry. In Saudi Arabia, the health care sector has adopted AI technologies over the past decade to enhance service efficiency and quality, aligning with the country's technological thrust under the Saudi Vision 2030 program.
Objective: This review aims to systematically examine the impact of AI on health care quality in Saudi Arabian hospitals.
Methods: A systematic literature review was undertaken to identify studies investigating AI's impact on health care in Saudi Arabia. We searched selected databases, including PubMed, Google Scholar, and the Saudi Digital Library. The search terms used were "Artificial Intelligence," "health care," "health care quality," "AI in Saudi Arabia," "AI in health care," and "health care providers." The review focused on studies published in the past 10 years, ensuring the inclusion of the most recent and relevant research on the effects of AI on Saudi Arabian health care organizations. The review included quantitative and qualitative analyses, providing a robust and comprehensive understanding of the topic.
Results: A systematic review of 12 studies explored AI's influence on health care services in Saudi Arabia, highlighting notable advancements in diagnostic accuracy, patient management, and operational efficiency. AI-driven models demonstrate high precision in disease prediction and early diagnosis, while machine learning optimizes telehealth, electronic health record compliance, and workflow efficiency, despite adoption challenges like connectivity limitations. Additionally, AI strengthens data security, reduces costs, and facilitates personalized treatment, ultimately enhancing health care delivery.
Conclusions: The review underscores that AI technologies have significantly improved diagnostic accuracy, patient management, and operational efficiency in Saudi Arabia's health care system. However, challenges such as data privacy, algorithmic bias, and robust regulations require attention to ensure successful AI integration in health care.
{"title":"AI in Health Care Service Quality: Systematic Review.","authors":"Eman Alghareeb, Najla Aljehani","doi":"10.2196/69209","DOIUrl":"10.2196/69209","url":null,"abstract":"<p><strong>Background: </strong>Artificial intelligence (AI) is a rapidly evolving technology with the potential to revolutionize the health care industry. In Saudi Arabia, the health care sector has adopted AI technologies over the past decade to enhance service efficiency and quality, aligning with the country's technological thrust under the Saudi Vision 2030 program.</p><p><strong>Objective: </strong>This review aims to systematically examine the impact of AI on health care quality in Saudi Arabian hospitals.</p><p><strong>Methods: </strong>A meticulous and comprehensive systematic literature review was undertaken to identify studies investigating AI's impact on health care in Saudi Arabia. We collected several studies from selected databases, including PubMed, Google Scholar, and Saudi Digital Library. The search terms used were \"Artificial Intelligence,\" \"health care,\" \"health care quality,\" \"AI in Saudi Arabia,\" \"AI in health care,\" and \"health care providers.\" The review focused on studies published in the past 10 years, ensuring the inclusion of the most recent and relevant research on the effects of AI on Saudi Arabian health care organizations. The review included quantitative and qualitative analyses, providing a robust and comprehensive understanding of the topic.</p><p><strong>Results: </strong>A systematic review of 12 studies explored AI's influence on health care services in Saudi Arabia, highlighting notable advancements in diagnostic accuracy, patient management, and operational efficiency. AI-driven models demonstrate high precision in disease prediction and early diagnosis, while machine learning optimizes telehealth, electronic health record compliance, and workflow efficiency, despite adoption challenges like connectivity limitations. Additionally, AI strengthens data security, reduces costs, and facilitates personalized treatment, ultimately enhancing health care delivery.</p><p><strong>Conclusions: </strong>The review underscores that AI technologies have significantly improved diagnostic accuracy, patient management, and operational efficiency in Saudi Arabia's health care system. However, challenges such as data privacy, algorithmic bias, and robust regulations require attention to ensure successful AI integration in health care.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e69209"},"PeriodicalIF":2.0,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12594439/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145453806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Examining Transparency in Kidney Transplant Recipient Selection Criteria: Nationwide Cross-Sectional Study.
Belen Rivera, Stalin Canizares, Gabriel Cojuc-Konigsberg, Olena Holub, Alex Nakonechnyi, Ritah R Chumdermpadetsuk, Keren Ladin, Devin E Eckhoff, Rebecca Allen, Aditya Pawar
Background: Choosing a transplant program impacts a patient's likelihood of receiving a kidney transplant. Most patients are unaware of the factors influencing their candidacy. As patients increasingly rely on online resources for health care decisions, this study quantifies the available online patient-level information on kidney transplant recipient (KTR) selection criteria across US kidney transplant centers.
Objective: We aimed to use natural language processing and a large language model to quantify the available online patient-level information regarding the guideline-recommended KTR selection criteria reported by US transplant centers.
Methods: A cross-sectional study using natural language processing and a large language model was conducted to review the websites of US kidney transplant centers from June to August 2024. Links were explored up to 3 levels deep, and information on 31 guideline-recommended KTR selection criteria was collected from each transplant center.
Results: A total of 255 US kidney transplant centers were analyzed, comprising 10,508 web pages and 9,113,753 words. Among the kidney transplant guideline-recommended KTR selection criteria, only 2.6% (206/7905) of the information was present on the transplant center web pages. Socioeconomic and behavioral criteria were mentioned more than those related to the patient's medical conditions and comorbidities. Of the 31 criteria, "finances and health insurance" was the most frequently mentioned, appearing in 25.5% (65/255) of the transplant centers. Other socioeconomic and behavioral criteria, such as family and social support systems, adherence, and psychosocial assessment, were addressed in less than 4% (9/255) of the transplant centers. No information was found on any web page for 45.2% (14/31) of the KTR selection criteria. Geographically, disparities in reporting were observed, with the South Atlantic division showing the highest number of distinct criteria, while New England had the fewest.
Conclusions: Most transplant center websites do not disclose patient-level KTR selection criteria online. The lack of transparency in the evaluation and listing process for kidney transplantation may limit patients in choosing their most suitable transplant center and successfully receiving a kidney transplant.
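The Methods describe exploring links up to 3 levels deep per center website; below is a minimal sketch of such a crawl, assuming a simple same-site, depth-limited traversal with a plain keyword check standing in for the study's LLM-based criteria extraction. The URL and criterion string are hypothetical.

```python
from urllib.parse import urljoin, urlparse
import requests
from bs4 import BeautifulSoup

def crawl(start_url: str, max_depth: int = 3) -> dict[str, str]:
    """Collect text from same-site pages reachable within max_depth link hops."""
    pages: dict[str, str] = {}
    frontier = [(start_url, 0)]
    seen = {start_url}
    site = urlparse(start_url).netloc
    while frontier:
        url, depth = frontier.pop()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        pages[url] = soup.get_text(" ", strip=True)
        if depth < max_depth:
            for a in soup.find_all("a", href=True):
                nxt = urljoin(url, a["href"])
                if urlparse(nxt).netloc == site and nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return pages

# Hypothetical center URL; a keyword check stands in for the LLM judgment.
pages = crawl("https://example-transplant-center.org")
hits = [u for u, text in pages.items() if "health insurance" in text.lower()]
print(len(pages), "pages crawled;", len(hits), "mention the criterion")
```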
{"title":"Examining Transparency in Kidney Transplant Recipient Selection Criteria: Nationwide Cross-Sectional Study.","authors":"Belen Rivera, Stalin Canizares, Gabriel Cojuc-Konigsberg, Olena Holub, Alex Nakonechnyi, Ritah R Chumdermpadetsuk, Keren Ladin, Devin E Eckhoff, Rebecca Allen, Aditya Pawar","doi":"10.2196/74066","DOIUrl":"10.2196/74066","url":null,"abstract":"<p><strong>Background: </strong>Choosing a transplant program impacts a patient's likelihood of receiving a kidney transplant. Most patients are unaware of the factors influencing their candidacy. As patients increasingly rely on online resources for health care decisions, this study quantifies the available online patient-level information on kidney transplant recipient (KTR) selection criteria across US kidney transplant centers.</p><p><strong>Objective: </strong>We aimed to use natural language processing and a large language model to quantify the available online patient-level information regarding the guideline-recommended KTR selection criteria reported by US transplant centers.</p><p><strong>Methods: </strong>A cross-sectional study using natural language processing and a large language model was conducted to review the websites of US kidney transplant centers from June to August 2024. Links were explored up to 3 levels deep, and information on 31 guideline-recommended KTR selection criteria was collected from each transplant center.</p><p><strong>Results: </strong>A total of 255 US kidney transplant centers were analyzed, comprising 10,508 web pages and 9,113,753 words. Among the kidney transplant guideline-recommended KTR selection criteria, only 2.6% (206/7905) of the information was present on the transplant center web pages. Socioeconomic and behavioral criteria were mentioned more than those related to the patient's medical conditions and comorbidities. Of the 31 criteria, finances and health insurance was the most frequently mentioned, appearing in 25.5% (65/255) of the transplant centers. Other socioeconomic and behavioral criteria, such as family and social support systems, adherence, and psychosocial assessment, were addressed in less than 4% (9/255) of the transplant centers. No information was found on any web page for 45.2% (14/31) of the KTR selection criteria. Geographically, disparities in reporting were observed, with the South Atlantic division showing the highest number of distinct criteria, while New England had the fewest.</p><p><strong>Conclusions: </strong>Most transplant center websites do not disclose patient-level KTR selection criteria online. The lack of transparency in the evaluation and listing process for kidney transplantation may limit patients in choosing their most suitable transplant center and successfully receiving a kidney transplant.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":" ","pages":"e74066"},"PeriodicalIF":2.0,"publicationDate":"2025-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12627972/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145115063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predetermined Change Control Plans: Guiding Principles for Advancing Safe, Effective, and High-Quality AI-ML Technologies.
Eduardo Carvalho, Miguel Mascarenhas, Francisca Pinheiro, Ricardo Correia, Sandra Balseiro, Guilherme Barbosa, Ana Guerra, Dulce Oliveira, Rita Moura, André Martins Dos Santos, Nilza Ramião
The adaptive nature of artificial intelligence (AI), with its ability to improve performance through continuous learning, offers substantial benefits across various sectors. However, current regulatory frameworks are not intended to accommodate this adaptive nature, and they have prolonged approval timelines, sometimes exceeding one year for some AI-enabled devices. This creates significant challenges for manufacturers who must deal with lengthy waits and submit multiple approval requests for AI-enabled device software functions as they are updated. In response, regulatory agencies like the US Food and Drug Administration (FDA) have introduced guidelines to better support the approval process for continuously evolving AI technologies. This article explores the FDA's concept of predetermined change control plans and how they can streamline regulatory oversight by reducing the need for repeated approvals, while ensuring safety and compliance. This can help reduce the burden for regulatory bodies and decrease waiting times for approval decisions, therefore fostering innovation, increasing market uptake, and exploiting the benefits of artificial intelligence and machine learning technologies.
{"title":"Predetermined Change Control Plans: Guiding Principles for Advancing Safe, Effective, and High-Quality AI-ML Technologies.","authors":"Eduardo Carvalho, Miguel Mascarenhas, Francisca Pinheiro, Ricardo Correia, Sandra Balseiro, Guilherme Barbosa, Ana Guerra, Dulce Oliveira, Rita Moura, André Martins Dos Santos, Nilza Ramião","doi":"10.2196/76854","DOIUrl":"10.2196/76854","url":null,"abstract":"<p><strong>Unlabelled: </strong>The adaptive nature of artificial intelligence (AI), with its ability to improve performance through continuous learning, offers substantial benefits across various sectors. However, current regulatory frameworks are not intended to accommodate this adaptive nature, and they have prolonged approval timelines, sometimes exceeding one year for some AI-enabled devices. This creates significant challenges for manufacturers who must deal with lengthy waits and submit multiple approval requests for AI-enabled device software functions as they are updated. In response, regulatory agencies like the US Food and Drug Administration (FDA) have introduced guidelines to better support the approval process for continuously evolving AI technologies. This article explores the FDA's concept of predetermined change control plans and how they can streamline regulatory oversight by reducing the need for repeated approvals, while ensuring safety and compliance. This can help reduce the burden for regulatory bodies and decrease waiting times for approval decisions, therefore fostering innovation, increasing market uptake, and exploiting the benefits of artificial intelligence and machine learning technologies.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e76854"},"PeriodicalIF":2.0,"publicationDate":"2025-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12577744/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145423746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Balancing Innovation and Control: The European Union AI Act in an Era of Global Uncertainty.
Elena Giovanna Bignami, Michele Russo, Federico Semeraro, Valentina Bellini
The European Union's Artificial Intelligence Act (EU AI Act), adopted in 2024, establishes a landmark regulatory framework for artificial intelligence (AI) systems, with significant implications for health care. The Act classifies medical AI as "high-risk," imposing stringent requirements for transparency, data governance, and human oversight. While these measures aim to safeguard patient safety, they may also hinder innovation, particularly for smaller health care providers and startups. Concurrently, geopolitical instability, marked by rising military expenditures, trade tensions, and supply chain disruptions, threatens health care innovation and access. This paper examines the challenges and opportunities posed by the AI Act in health care within a volatile geopolitical landscape. It evaluates the intersection of Europe's regulatory approach with competing priorities, including technological sovereignty, ethical AI, and equitable health care, while addressing unintended consequences such as reduced innovation and supply chain vulnerabilities. The study employs a comprehensive review of the EU AI Act's provisions, geopolitical trends, and their implications for health care. It analyzes regulatory documents, stakeholder statements, and case studies to assess compliance burdens, innovation barriers, and geopolitical risks. The paper also synthesizes recommendations from multidisciplinary experts to propose actionable solutions. Key findings include: (1) the AI Act's high-risk classification for medical AI could improve patient safety but risks stifling innovation due to compliance costs (eg, €29,277 annually per AI unit) and certification burdens (€16,800-23,000 per unit); (2) geopolitical factors, such as United States-China semiconductor tariffs and EU rearmament, exacerbate supply chain vulnerabilities and divert funding from health care innovation; (3) the dominance of "superstar" firms in AI development may marginalize smaller players, further concentrating innovation in well-resourced organizations; and (4) regulatory sandboxes, AI literacy programs, and international collaboration emerge as viable strategies to balance innovation and compliance. The EU AI Act provides a critical framework for ethical AI in health care, but its success depends on mitigating regulatory burdens and geopolitical risks. Proactive measures, such as multidisciplinary task forces, resilient supply chains, and human-augmented AI systems, are essential to foster innovation while ensuring patient safety. Policymakers, clinicians, and technologists must collaborate to navigate these challenges in an era of global uncertainty.
{"title":"Balancing Innovation and Control: The European Union AI Act in an Era of Global Uncertainty.","authors":"Elena Giovanna Bignami, Michele Russo, Federico Semeraro, Valentina Bellini","doi":"10.2196/75527","DOIUrl":"10.2196/75527","url":null,"abstract":"<p><strong>Unlabelled: </strong>The European Union's Artificial Intelligence Act (EU AI Act), adopted in 2024, establishes a landmark regulatory framework for artificial intelligence (AI) systems, with significant implications for health care. The Act classifies medical AI as \"high-risk,\" imposing stringent requirements for transparency, data governance, and human oversight. While these measures aim to safeguard patient safety, they may also hinder innovation, particularly for smaller health care providers and startups. Concurrently, geopolitical instability-marked by rising military expenditures, trade tensions, and supply chain disruptions-threatens health care innovation and access. This paper examines the challenges and opportunities posed by the AI Act in health care within a volatile geopolitical landscape. It evaluates the intersection of Europe's regulatory approach with competing priorities, including technological sovereignty, ethical AI, and equitable health care, while addressing unintended consequences such as reduced innovation and supply chain vulnerabilities. The study employs a comprehensive review of the EU AI Act's provisions, geopolitical trends, and their implications for health care. It analyzes regulatory documents, stakeholder statements, and case studies to assess compliance burdens, innovation barriers, and geopolitical risks. The paper also synthesizes recommendations from multidisciplinary experts to propose actionable solutions. Key findings include: (1) the AI Act's high-risk classification for medical AI could improve patient safety but risks stifling innovation due to compliance costs (eg, €29,277 annually per AI unit) and certification burdens (€16,800-23,000 per unit); (2) geopolitical factors-such as United States-China semiconductor tariffs and EU rearmament-exacerbate supply chain vulnerabilities and divert funding from health care innovation; (3) the dominance of \"superstar\" firms in AI development may marginalize smaller players, further concentrating innovation in well-resourced organizations; and (4) regulatory sandboxes, AI literacy programs, and international collaboration emerge as viable strategies to balance innovation and compliance. The EU AI Act provides a critical framework for ethical AI in health care, but its success depends on mitigating regulatory burdens and geopolitical risks. Proactive measures-such as multidisciplinary task forces, resilient supply chains, and human-augmented AI systems-are essential to foster innovation while ensuring patient safety. 
Policymakers, clinicians, and technologists must collaborate to navigate these challenges in an era of global uncertainty.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e75527"},"PeriodicalIF":2.0,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574960/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145411036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ETHICS of AI Adoption and Deployment in Health Care: Progress, Challenges, and Next Steps.
Obinna O Oleribe, Andrew W Taylor-Robinson, Christian C Chimezie, Simon D Taylor-Robinson
Generative artificial intelligence (GenAI) is increasingly being integrated into health care, offering a wide array of benefits. Currently, GenAI applications are useful in disease risk prediction and preventive care, diagnostics via imaging, artificial intelligence (AI)-assisted devices and point-of-care tools, drug discovery and design, on-site and remote patient and disease monitoring (including wearables and device integration), integration of multimodal data and personalized medicine, robotic surgery, and health system efficiency and workflow optimization, among other aspects of disease prevention, control, diagnosis, and treatment. Recent breakthroughs have led to the development of reliable and safer GenAI systems capable of handling the complexity of health care data. The potential of GenAI to optimize resource use and enhance productivity underscores its critical role in patient care. However, the use of AI in health is not without critical gaps and challenges, including (but not limited to) AI-related environmental concerns, transparency and explainability, hallucinations, inclusiveness and inconsistencies, cost and clinical workflow integration, and safety and security of data (ETHICS). In addition, the governance and regulatory issues surrounding GenAI applications in health care highlight the importance of addressing these aspects for responsible and appropriate GenAI integration. Building on AI's promising start necessitates striking a balance between technical advancements and ethical, equity, and environmental concerns. Here, we highlight several ways in which the transformative power of GenAI is revolutionizing public health practice and patient care, acknowledge gaps and challenges, and indicate future directions for AI adoption and deployment.
{"title":"ETHICS of AI Adoption and Deployment in Health Care: Progress, Challenges, and Next Steps.","authors":"Obinna O Oleribe, Andrew W Taylor-Robinson, Christian C Chimezie, Simon D Taylor-Robinson","doi":"10.2196/67626","DOIUrl":"10.2196/67626","url":null,"abstract":"<p><p>Generative artificial intelligence (GenAI) is increasingly being integrated into health care, offering a wide array of benefits. Currently, GenAI applications are useful in disease risk prediction and preventive care, diagnostics via imaging, artificial intelligence (AI)-assisted devices and point-of-care tools, drug discovery and design, patient and disease monitoring, remote monitoring and wearables, integration of multimodal data and personalized medicine, on-site and remote patient and disease monitoring and device integration, robotic surgery, and health system efficiency and workflow optimization, among other aspects of disease prevention, control, diagnosis, and treatment. Recent breakthroughs have led to the development of reliable and safer GenAI systems capable of handling the complexity of health care data. The potential of GenAI to optimize resource use and enhance productivity underscores its critical role in patient care. However, the use of AI in health is not without critical gaps and challenges, including (but not limited to) AI-related environmental concerns, transparency and explainability, hallucinations, inclusiveness and inconsistencies, cost and clinical workflow integration, and safety and security of data (ETHICS). In addition, the governance and regulatory issues surrounding GenAI applications in health care highlight the importance of addressing these aspects for responsible and appropriate GenAI integration. Building on AI's promising start necessitates striking a balance between technical advancements and ethical, equity, and environmental concerns. Here, we highlight several ways in which the transformative power of GenAI is revolutionizing public health practice and patient care, acknowledge gaps and challenges, and indicate future directions for AI adoption and deployment.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e67626"},"PeriodicalIF":2.0,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12616186/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145411110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating the Reliability and Accuracy of an AI-Powered Search Engine in Providing Responses on Dietary Supplements: Quantitative and Qualitative Evaluation.
Mingxin Liu, Tsuyoshi Okuhara, Ritsuko Shirabe, Yuriko Nishiie, Yinghan Xu, Hiroko Okada, Takahiro Kiuchi
Background: The widespread adoption of artificial intelligence (AI)-powered search engines has transformed how people access health information. Microsoft Copilot, formerly Bing Chat, offers real-time web-sourced responses to user queries, raising concerns about the reliability of its health content. This is particularly critical in the domain of dietary supplements, where scientific consensus is limited and online misinformation is prevalent. Despite the popularity of supplements in Japan, little is known about the accuracy of AI-generated advice on their effectiveness for common diseases.
Objective: We aimed to evaluate the reliability and accuracy of Microsoft Copilot, an AI search engine, in responding to health-related queries about dietary supplements. Our findings can help consumers use large language models more safely and effectively when seeking information on dietary supplements and support developers in improving large language models' performance in this field.
Methods: We simulated typical consumer behavior by posing 180 questions (6 per supplement × 30 supplements) to Copilot's 3 response modes (creative, balanced, and precise) in Japanese. These questions addressed the effectiveness of supplements in treating 6 common conditions (cancer, diabetes, obesity, constipation, joint pain, and hypertension). We classified the AI search engine's answers as "effective," "uncertain," or "ineffective" and evaluated them for accuracy against evidence-based assessments conducted by licensed physicians. We conducted a qualitative content analysis of the response texts and systematically examined the types of sources cited in all responses.
Results: The proportion of Copilot responses claiming supplement effectiveness was 29.4% (53/180), 47.8% (86/180), and 45% (81/180) for the creative, balanced, and precise modes, respectively, whereas overall accuracy of the responses was low across all modes: 36.1% (65/180), 31.7% (57/180), and 31.7% (57/180) for creative, balanced, and precise, respectively. No significant difference was observed among the 3 modes (P=.59). Notably, 72.7% (2240/3081) of the citations came from unverified sources such as blogs, sales websites, and social media. Of the 540 responses analyzed, 54 (10%) contained at least 1 citation in which the cited source did not include or support the claim made by Copilot, indicating hallucinated content. Only 48.5% (262/540) of the responses included a recommendation to consult health care professionals. Among disease categories, the highest accuracy was found for cancer-related questions, likely due to lower misinformation prevalence.
Conclusions: This is the first study to assess Copilot's performance on dietary supplement information. Despite its authoritative appearance, Copilot frequently cited noncredible sources and provided ambiguous or inaccurate information. Its tendency to a
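A minimal sketch of the tallying implied by the Methods and Results: per-mode proportions of "effective" verdicts and accuracy against the physician assessments. The records below are placeholders; in the study each mode answered 180 questions, for 540 responses in total.

```python
from collections import Counter

# One (mode, Copilot verdict, physician verdict) tuple per question; the
# study had 180 such tuples per mode. These three are placeholders.
records = [
    ("creative", "effective", "uncertain"),
    ("balanced", "ineffective", "ineffective"),
    ("precise", "effective", "effective"),
]

by_mode: dict[str, Counter] = {}
for mode, verdict, truth in records:
    c = by_mode.setdefault(mode, Counter())
    c["n"] += 1
    c["claims_effective"] += verdict == "effective"  # bool adds as 0/1
    c["correct"] += verdict == truth

for mode, c in by_mode.items():
    print(mode,
          f"claims-effective={c['claims_effective'] / c['n']:.1%}",
          f"accuracy={c['correct'] / c['n']:.1%}")
```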
{"title":"Evaluating the Reliability and Accuracy of an AI-Powered Search Engine in Providing Responses on Dietary Supplements: Quantitative and Qualitative Evaluation.","authors":"Mingxin Liu, Tsuyoshi Okuhara, Ritsuko Shirabe, Yuriko Nishiie, Yinghan Xu, Hiroko Okada, Takahiro Kiuchi","doi":"10.2196/78436","DOIUrl":"10.2196/78436","url":null,"abstract":"<p><strong>Background: </strong>The widespread adoption of artificial intelligence (AI)-powered search engines has transformed how people access health information. Microsoft Copilot, formerly Bing Chat, offers real-time web-sourced responses to user queries, raising concerns about the reliability of its health content. This is particularly critical in the domain of dietary supplements, where scientific consensus is limited and online misinformation is prevalent. Despite the popularity of supplements in Japan, little is known about the accuracy of AI-generated advice on their effectiveness for common diseases.</p><p><strong>Objective: </strong>We aimed to evaluate the reliability and accuracy of Microsoft Copilot, an AI search engine, in responding to health-related queries about dietary supplements. Our findings can help consumers use large language models more safely and effectively when seeking information on dietary supplements and support developers in improving large language models' performance in this field.</p><p><strong>Methods: </strong>We simulated typical consumer behavior by posing 180 questions (6 per supplement × 30 supplements) to Copilot's 3 response modes (creative, balanced, and precise) in Japanese. These questions addressed the effectiveness of supplements in treating 6 common conditions (cancer, diabetes, obesity, constipation, joint pain, and hypertension). We classified the AI search engine's answers as \"effective,\" \"uncertain,\" or \"ineffective\" and evaluated for accuracy against evidence-based assessments conducted by licensed physicians. We conducted a qualitative content analysis of the response texts and systematically examined the types of sources cited in all responses.</p><p><strong>Results: </strong>The proportion of Copilot responses claiming supplement effectiveness was 29.4% (53/180), 47.8% (86/180), and 45% (81/180) for the creative, balanced, and precise modes, respectively, whereas overall accuracy of the responses was low across all modes: 36.1% (65/180), 31.7% (57/180), and 31.7% (57/180) for creative, balanced, and precise, respectively. No significant difference was observed among the 3 modes (P=.59). Notably, 72.7% (2240/3081) of the citations came from unverified sources such as blogs, sales websites, and social media. Of the 540 responses analyzed, 54 (10%) contained at least 1 citation in which the cited source did not include or support the claim made by Copilot, indicating hallucinated content. Only 48.5% (262/540) of the responses included a recommendation to consult health care professionals. Among disease categories, the highest accuracy was found for cancer-related questions, likely due to lower misinformation prevalence.</p><p><strong>Conclusions: </strong>This is the first study to assess Copilot's performance on dietary supplement information. Despite its authoritative appearance, Copilot frequently cited noncredible sources and provided ambiguous or inaccurate information. 
Its tendency to a","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e78436"},"PeriodicalIF":2.0,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12571200/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145402750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI Awareness and Tobacco Policy Messaging Among US Adults: Electronic Experimental Study.
Julia Mary Alber, David Askay, Anuraj Dhillon, Lauren Sandoval, Sofia Ramos, Katharine Santilena
Background: Despite public health efforts, tobacco use remains the leading cause of preventable death in the United States and continues to disproportionately affect underrepresented populations. Public policies are needed to improve health equity in tobacco-related health outcomes. One strategy for promoting public support for these policies is through health messaging. Improvements in artificial intelligence (AI) technology offer new opportunities to create tailored policy messages quickly; however, there is limited research on how the public might perceive the use of AI for public health messages.
Objective: This study aimed to examine how knowledge of AI use impacts perceptions of a tobacco control policy video.
Methods: A national sample of US adults (N=500) was shown the same AI-generated video that focused on a tobacco control policy. Participants were then randomly assigned to 1 of 4 conditions where they were (1) told the narrator of the video was AI, (2) told the narrator of the video was human, (3) told it was unknown whether the narrator was AI or human, or (4) not provided any information about the narrator.
Results: Perceived video rating, effectiveness, and credibility did not significantly differ among the conditions. However, the mean speaker rating was significantly higher (P=.001) when participants were told the narrator of the health message was human (mean 3.65, SD 0.91) compared to the other conditions. Notably, positive attitudes toward AI were highest among those not provided information about the narrator; however, this difference was not statistically significant (mean 3.04, SD 0.90).
Conclusions: Results suggest that AI may impact perceptions of the speaker of a video; however, more research is needed to understand if these impacts would occur over time and after multiple exposures to content. Further qualitative research may help explain why potential differences may have occurred in speaker ratings. Public health professionals and researchers should further consider the ethics and cost-effectiveness of using AI for health messaging.
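A minimal sketch of one conventional analysis for this four-condition design: comparing mean speaker ratings with a one-way ANOVA. The abstract does not name the exact test used, and the ratings below are simulated, with only the told-human group's mean and SD echoing the reported values (3.65, 0.91).

```python
import numpy as np
from scipy.stats import f_oneway

# Simulated ratings for N=500 participants split evenly across 4 conditions.
rng = np.random.default_rng(0)
told_ai      = rng.normal(3.30, 0.90, 125)
told_human   = rng.normal(3.65, 0.91, 125)  # echoes the reported mean/SD
told_unknown = rng.normal(3.30, 0.90, 125)
no_info      = rng.normal(3.30, 0.90, 125)

stat, p = f_oneway(told_ai, told_human, told_unknown, no_info)
print(f"F={stat:.2f}, p={p:.4f}")
```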
{"title":"AI Awareness and Tobacco Policy Messaging Among US Adults: Electronic Experimental Study.","authors":"Julia Mary Alber, David Askay, Anuraj Dhillon, Lauren Sandoval, Sofia Ramos, Katharine Santilena","doi":"10.2196/72987","DOIUrl":"10.2196/72987","url":null,"abstract":"<p><strong>Background: </strong>Despite public health efforts, tobacco use remains the leading cause of preventable death in the United States and continues to disproportionately affect underrepresented populations. Public policies are needed to improve health equity in tobacco-related health outcomes. One strategy for promoting public support for these policies is through health messaging. Improvements in artificial intelligence (AI) technology offer new opportunities to create tailored policy messages quickly; however, there is limited research on how the public might perceive the use of AI for public health messages.</p><p><strong>Objective: </strong>This study aimed to examine how knowledge of AI use impacts perceptions of a tobacco control policy video.</p><p><strong>Methods: </strong>A national sample of US adults (N=500) was shown the same AI-generated video that focused on a tobacco control policy. Participants were then randomly assigned to 1 of 4 conditions where they were (1) told the narrator of the video was AI, (2) told the narrator of the video was human, (3) told it was unknown whether the narrator was AI or human, or (4) not provided any information about the narrator.</p><p><strong>Results: </strong>Perceived video rating, effectiveness, and credibility did not significantly differ among the conditions. However, the mean speaker rating was significantly higher (P=.001) when participants were told the narrator of the health message was human (mean 3.65, SD 0.91) compared to the other conditions. Notably, positive attitudes toward AI were highest among those not provided information about the narrator; however, this difference was not statistically significant (mean 3.04, SD 0.90).</p><p><strong>Conclusions: </strong>Results suggest that AI may impact perceptions of the speaker of a video; however, more research is needed to understand if these impacts would occur over time and after multiple exposures to content. Further qualitative research may help explain why potential differences may have occurred in speaker ratings. Public health professionals and researchers should further consider the ethics and cost-effectiveness of using AI for health messaging.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e72987"},"PeriodicalIF":2.0,"publicationDate":"2025-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12558419/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145380050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}