Background: Public attitudes toward health issues are becoming increasingly polarized, as seen in social media comments, which vary from supportive to oppositional and frequently include uncivil language. The combined effects of comment slant and comment tone on health behavior among a polarized public need further examination.
Objective: This study aims to examine how social media users' prior attitudes toward mask wearing and their exposure to a mask-wearing-promoting post, synchronized with polarized and hostile discussions, affect their compliance with mask wearing.
Methods: The study was a web-based survey experiment with participants recruited from Amazon Mechanical Turk. A total of 522 participants provided consent and completed the study. Participants were assigned to read a fictitious mask-wearing-promoting social media post with either civil anti-mask-wearing comments (130/522, 24.9%), civil pro-mask-wearing comments (129/522, 24.7%), uncivil anti-mask-wearing comments (131/522, 25.1%), or uncivil pro-mask-wearing comments (132/522, 25.3%). Following this, the participants were asked to complete self-assessed questionnaires. The PROCESS macro in SPSS (model 12; IBM Corp) was used to test the 3-way interaction effects between comment slant, comment tone, and prior attitudes on participants' presumed influence from the post and their behavioral intention to comply with mask-wearing.
Results: Anti-mask-wearing comments led social media users to presume less influence about others' acceptance of masks (B=1.49; P<.001; 95% CI 0.98-2.00) and resulted in decreased mask-wearing intention (B=0.07; P=.03; 95% CI 0.01-0.13). Comment tone with incivility also reduced compliance with mask-wearing (B=-0.44; P=.02; 95% CI -0.81 to -0.07). Furthermore, polarized attitudes had a direct impact (B=0.86; P<.001; 95% CI 0.45-1.26) and also interacted with both the slant and tone of comments, influencing mask-wearing intention.
Conclusions: Pro-mask-wearing comments enhanced presumed influence and compliance with mask wearing, but incivility in the comments hindered this positive impact. Antimaskers showed increased compliance when they were unable to find civil support for their opinion in the social media environment. The findings suggest the need to correct and moderate uncivil language and misleading information in online comment sections while encouraging the posting of supportive and civil comments. In addition, information literacy programs are needed to prevent the public from being misled by polarized comments.
Fangcao Lu, Caixie Tu. The Impact of Comment Slant and Comment Tone on Digital Health Communication Among Polarized Publics: A Web-Based Survey Experiment. Journal of Medical Internet Research. 2024;26:e57967. Published 2024-11-15. doi: 10.2196/57967
Background: Recent studies have identified significant gaps in equity, diversity, and inclusion (EDI) considerations within the lifecycle of artificial intelligence (AI), spanning from data collection and problem definition to implementation stages. Despite the recognized need for integrating EDI principles, there is currently no existing guideline or framework to support this integration in the AI lifecycle.
Objective: This study aimed to address this gap by identifying EDI principles and indicators to be integrated into the AI lifecycle. The goal was to develop a comprehensive framework to guide the development and implementation of future AI systems.
Methods: This study was conducted in 3 phases. In phase 1, a comprehensive systematic scoping review explored how EDI principles have been integrated into AI in health and oral health care settings. In phase 2, a multidisciplinary team was established, and two 2-day, in-person international workshops with over 60 representatives from diverse backgrounds, expertise, and communities were conducted. The workshops included plenary presentations, round table discussions, and focused group discussions. In phase 3, based on the workshops' insights, the EDAI framework was developed and refined through iterative feedback from participants. The results of the initial systematic scoping review have been published separately, and this paper focuses on the subsequent phases of the project, which relate to framework development.
Results: In this study, we developed the EDAI framework, a comprehensive guideline that integrates EDI principles and indicators throughout the entire AI lifecycle. This framework addresses existing gaps at various stages, from data collection to implementation, and focuses on individual, organizational, and systemic levels. Additionally, we identified both the facilitators and barriers to integrating EDI within the AI lifecycle in health and oral health care.
Conclusions: The developed EDAI framework provides a comprehensive, actionable guideline for integrating EDI principles into AI development and deployment. By facilitating the systematic incorporation of these principles, the framework supports the creation and implementation of AI systems that are not only technologically advanced but also sensitive to EDI principles.
Samira Abbasgholizadeh Rahimi, Richa Shrivastava, Anita Brown-Johnson, Pascale Caidor, Claire Davies, Amal Idrissi Janati, Pascaline Kengne Talla, Sreenath Madathil, Bettina M Willie, Elham Emami. EDAI Framework for Integrating Equity, Diversity, and Inclusion Throughout the Lifecycle of AI to Improve Health and Oral Health Care: Qualitative Study. Journal of Medical Internet Research. 2024;26:e63356. Published 2024-11-15. doi: 10.2196/63356
Sarah Jackson, Dimitra Kale, Emma Beard, Olga Perski, Robert West, Jamie Brown
Background: Digital technologies offer the potential for low-cost, scalable delivery of interventions to promote smoking cessation.
Objective: We aimed to evaluate the effectiveness of the offer of Smoke Free, an evidence-informed and widely used app, for smoking cessation versus no support.
Methods: In this 2-arm randomized controlled trial, 3143 motivated adult smokers were recruited online between August 2020 and April 2021 and randomized to receive an offer of the Smoke Free app plus follow-up (intervention arm) versus follow-up only (comparator arm). Both groups were shown a brief message at the end of the baseline questionnaire encouraging them to make a quit attempt. The primary outcome was self-reported 6-month continuous abstinence assessed 7 months after randomization. Secondary outcomes included quit attempts in the first month post randomization, 3-month continuous abstinence assessed at 4 months, and 6-month continuous abstinence at 7 months among those who made a quit attempt. The primary analysis was performed on an intention-to-treat (ITT) basis. Sensitivity analyses included (1) restricting the intervention group to those who took up the offer of the app, (2) using complete cases, and (3) using multiple imputation.
Results: The effective follow-up rate for 7 months was 41.9%. The primary analysis showed no evidence of a benefit of the intervention on rates of 6-month continuous abstinence (intervention 6.8% vs comparator 7.0%; relative risk 0.97, 95% CI 0.75-1.26). Analyses of all secondary outcomes also showed no evidence of a benefit. Similar results were observed on complete cases and using multiple imputation. When the intervention group was restricted to those who took up the offer of the app (n=395, 25.3%), participants in the intervention group were 80% more likely to report 6-month continuous abstinence (12.7% vs 7.0%; relative risk 1.80, 95% CI 1.30-2.45). Equivalent subgroup analyses produced similar results on the secondary outcomes. These differences persisted after adjustment for key baseline characteristics.
Conclusions: Among motivated smokers provided with very brief advice to quit, the offer of the Smoke Free app did not have a detectable benefit for cessation compared with follow-up only. However, the app increased quit rates when smokers randomized to receive the app downloaded it.
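The relative risks and 95% CIs reported in the Results can be illustrated with a minimal sketch. It assumes the common log-scale (Katz) approximation for the CI, which the abstract does not confirm as the authors' method. The function name `relative_risk` and the arm size of 1000 per group are hypothetical; the counts were chosen only to match the reported 6.8% and 7.0% abstinence rates and are not trial data, so the resulting interval will differ from the published one.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Return (rr, lower, upper) for group A versus group B.

    Uses the log-scale (Katz) approximation for the confidence interval.
    This is an assumed method, not necessarily the one used in the trial.
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    rr = risk_a / risk_b
    # Standard error of log(RR) under the Katz approximation.
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 68/1000 abstinent (6.8%) vs 70/1000 abstinent (7.0%).
rr, lower, upper = relative_risk(68, 1000, 70, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that spans 1, as it does for these illustrative counts, corresponds to the abstract's wording of "no evidence of a benefit"; the restricted analysis reports an interval lying entirely above 1.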
Trial registration: ISRCTN ISRCTN85785540; https://www.isrctn.com/ISRCTN85785540.
International Registered Report Identifier (IRRID): RR2-https://onlinelibrary.wiley.com/doi/full/10.1111/add.14652.
Sarah Jackson, Dimitra Kale, Emma Beard, Olga Perski, Robert West, Jamie Brown. Effectiveness of the Offer of the Smoke Free Smartphone App Compared With No Intervention for Smoking Cessation: Pragmatic Randomized Controlled Trial. Journal of Medical Internet Research. 2024;26:e50963. Published 2024-11-15. https://doi.org/10.2196/50963
Background: As crowdfunding sites proliferate, visual content often serves as the initial bridge connecting a project to its potential backers, underscoring the importance of image selection in effectively engaging an audience.
Objective: This paper aims to explore the relationship between images and crowdfunding success in cancer-related crowdfunding projects.
Methods: We used the Alibaba Cloud platform to detect individual features in images. In addition, we used the Recognize Anything Model to label images and obtain content tags. Furthermore, the discourse atomic topic model was used to generate image topics. After obtaining the image features and image content topics, we built regression models to investigate the factors that influence crowdfunding success.
Results: Images with a higher proportion of young people (β=0.0753; P<.001), a larger number of people (β=0.00822; P<.001), and a larger proportion of smiling faces (β=0.0446; P<.001) had a higher success rate. Image content related to good things and patient health also contributed to crowdfunding success (β=0.082, P<.001; and β=0.036, P<.001, respectively). In addition, the interaction between image topics and image characteristics had a significant effect on the final fundraising outcome. For example, when smiling faces are considered in conjunction with the image topics, using more smiling faces in the rest and play theme increased the amount of money raised (β=0.0152; P<.001). We also examined causality through a counterfactual analysis, which confirmed the influence of the variables on crowdfunding success, consistent with the results of our regression models.
Conclusions: In the realm of web-based medical crowdfunding, the importance of uploaded images cannot be overstated. Image characteristics, including the number of people depicted and the presence of youth, significantly improve fundraising results. In addition, the thematic choice of images in cancer crowdfunding efforts has a profound impact. Images that evoke beauty and resonate with health issues are more likely to result in increased donations. However, it is critical to recognize that reinforcing character traits in images of different themes has different effects on the success of crowdfunding campaigns.
Renwu Wang, Huimin Xu, Xupin Zhang. Impact of Image Content on Medical Crowdfunding Success: A Machine Learning Approach. Journal of Medical Internet Research. 2024;26:e58617. Published 2024-11-15. doi: 10.2196/58617
Background: To accurately capture an individual's food intake, dietitians are often required to ask clients about their food frequencies and portions, and they have to rely on the client's memory, which can be burdensome. While taking food photos alongside food records can alleviate user burden and reduce errors in self-reporting, this method still requires trained staff to translate food photos into dietary intake data. Image-assisted dietary assessment (IADA) is an innovative approach that uses computer algorithms to mimic human performance in estimating dietary information from food images. This field has seen continuous improvement through advancements in computer science, particularly in artificial intelligence (AI). However, the technical nature of this field can make it challenging for those without a technical background to understand it completely.
Objective: This review aims to fill the gap by providing a current overview of AI's integration into dietary assessment using food images. The content is organized chronologically and presented in an accessible manner for those unfamiliar with AI terminology. In addition, we discuss the systems' strengths and weaknesses and propose enhancements to improve IADA's accuracy and adoption in the nutrition community.
Methods: This scoping review used PubMed and Google Scholar databases to identify relevant studies. The review focused on computational techniques used in IADA, specifically AI models, devices, and sensors, or digital methods for food recognition and food volume estimation published between 2008 and 2021.
Results: A total of 522 articles were initially identified. On the basis of a rigorous selection process, 84 (16.1%) articles were ultimately included in this review. The selected articles reveal that early systems, developed before 2015, relied on handcrafted machine learning algorithms to manage traditional sequential processes, such as segmentation, food identification, portion estimation, and nutrient calculations. Since 2015, these handcrafted algorithms have been largely replaced by deep learning algorithms for handling the same tasks. More recently, the traditional sequential process has been superseded by advanced algorithms, including multitask convolutional neural networks and generative adversarial networks. Most of the systems were validated for macronutrient and energy estimation, while only a few were capable of estimating micronutrients, such as sodium. Notably, significant advancements have been made in the field of IADA, with efforts focused on replicating humanlike performance.
Conclusions: This review highlights the progress made by IADA, particularly in the areas of food identification and portion estimation. Advancements in AI techniques have shown great potential to improve the accuracy and efficiency of this field. However, it is crucial to involve diet
Phawinpon Chotwanvirat, Aree Prachansuwan, Pimnapanut Sridonpai, Wantanee Kriengsinyos. Advancements in Using AI for Dietary Assessment Based on Food Images: Scoping Review. Journal of Medical Internet Research. 2024;26:e51432. Published 2024-11-15. https://doi.org/10.2196/51432
Background: Although physical activity (PA) has positive effects on health and well-being, physical inactivity is a worldwide problem. Mobile health interventions have been shown to be effective in promoting PA. Personalizing persuasive strategies improves intervention success and can be conducted using machine learning (ML). For PA, several studies have addressed personalized persuasive strategies without ML, whereas others have included personalization using ML without focusing on persuasive strategies. An overview of studies that use ML to personalize persuasive strategies in PA-promoting interventions, with a corresponding categorization, could help in designing such interventions but is still missing.
Objective: First, we aimed to provide an overview of implemented ML techniques to personalize persuasive strategies in mobile health interventions promoting PA. Moreover, we aimed to present a categorization overview as a starting point for applying ML techniques in this field.
Methods: A scoping review was conducted based on the framework by Arksey and O'Malley and the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) criteria. Scopus, Web of Science, and PubMed were searched for studies that included ML to personalize persuasive strategies in interventions promoting PA. Papers were screened using the ASReview software. From the included papers, categorized by the research project they belonged to, we extracted data regarding general study information, target group, PA intervention, implemented technology, and study details. On the basis of the analysis of these data, a categorization overview was given.
Results: In total, 40 papers belonging to 27 different projects were included. These papers could be categorized into 4 groups based on their dimension of personalization. Then, for each dimension, 1 or 2 persuasive strategy categories were found together with a type of ML. The overview resulted in a categorization consisting of 3 levels: dimension of personalization, persuasive strategy, and type of ML. When personalizing the timing of the messages, most projects implemented reinforcement learning to personalize the timing of reminders and supervised learning (SL) to personalize the timing of feedback, monitoring, and goal-setting messages. Regarding the content of the messages, most projects implemented SL to personalize PA suggestions and feedback or educational messages. For personalizing PA suggestions, SL can be implemented either alone or combined with a recommender system. Finally, reinforcement learning was mostly used to personalize the type of feedback messages.
Conclusions: The overview of all implemented persuasive strategies and their corresponding ML methods is insightful for this interdisciplinary field. Moreover, it led to a categorization overview that provides insights into the design and development of personalized persuasive strategies to promote PA. In future papers, the categorization overview might be extended with additional layers to specify ML methods or additional dimensions of personalization and persuasive strategies.
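The 3-level categorization described in the Results (dimension of personalization, persuasive strategy, type of ML) can be sketched as a nested mapping. This is only an illustrative sketch: the dictionary keys are paraphrases of the abstract's wording, the helper name `strategies_using` is hypothetical, and only the combinations the abstract names are listed (it mentions 4 dimensions but spells out fewer).

```python
# Illustrative sketch of the 3-level categorization from the abstract:
# dimension of personalization -> persuasive strategy -> type of ML.
# Key names are paraphrased assumptions, not the authors' own labels.
CATEGORIZATION = {
    "timing of messages": {
        "reminders": "reinforcement learning",
        "feedback, monitoring, and goal-setting messages": "supervised learning",
    },
    "content of messages": {
        "PA suggestions": "supervised learning (alone or with a recommender system)",
        "feedback or educational messages": "supervised learning",
    },
    "type of messages": {
        "feedback messages": "reinforcement learning",
    },
}


def strategies_using(ml_keyword: str) -> list[tuple[str, str]]:
    """Return (dimension, strategy) pairs whose ML type mentions ml_keyword.

    Hypothetical helper for browsing the mapping above.
    """
    return [
        (dimension, strategy)
        for dimension, strategies in CATEGORIZATION.items()
        for strategy, ml_type in strategies.items()
        if ml_keyword in ml_type
    ]


if __name__ == "__main__":
    for dimension, strategy in strategies_using("reinforcement"):
        print(f"{dimension}: {strategy}")
```

Reading the mapping in this direction (dimension first) mirrors how the abstract presents its findings; a designer starting from a chosen ML type can use the helper to look up where that type was mostly applied.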
{"title":"Machine Learning Methods to Personalize Persuasive Strategies in mHealth Interventions That Promote Physical Activity: Scoping Review and Categorization Overview.","authors":"Annette Brons, Shihan Wang, Bart Visser, Ben Kröse, Sander Bakkes, Remco Veltkamp","doi":"10.2196/47774","journal":{"name":"Journal of Medical Internet Research","volume":"26","pages":"e47774"},"publicationDate":"2024-11-15","publicationTypes":"Journal Article"}
Kelly Quinn, Sarah Leiser Ransom, Carrie O'Connell, Naoko Muramatsu, David X Marquez, Jessie Chin
[This corrects the article DOI: 10.2196/54800.]
{"title":"Correction: Assessing the Feasibility and Acceptability of Smart Speakers in Behavioral Intervention Research With Older Adults: Mixed Methods Study.","authors":"Kelly Quinn, Sarah Leiser Ransom, Carrie O'Connell, Naoko Muramatsu, David X Marquez, Jessie Chin","doi":"10.2196/66813","journal":{"name":"Journal of Medical Internet Research","volume":"26","pages":"e66813"},"publicationDate":"2024-11-15","publicationTypes":"Journal Article"}
Background: Telemedicine is expanding rapidly, with public direct-to-consumer (DTC) telemedicine representing 70% of the market. A key priority is establishing clear quality distinctions between the public and private sectors. No studies have directly compared the quality of DTC telemedicine in the public and private sectors using objective evaluation methods.
Objective: Using a standardized patient (SP) approach, this study aimed to compare the quality of DTC telemedicine provided by China's public and private sectors.
Methods: We recruited 10 SPs presenting fixed cases (urticaria and childhood diarrhea), resulting in 594 interactions between SPs and physicians. The SPs evaluated several aspects of the quality of care (effectiveness, safety, patient-centeredness [PCC], efficiency, and timeliness) using the Institute of Medicine (IOM) quality framework. Ordinary least-squares (OLS) regression models with fixed effects were used for continuous variables, while logistic regression models with fixed effects were used for categorical variables.
Results: Significant quality differences were observed between public and private DTC telemedicine. Physicians from private platforms were significantly more likely to adhere to clinical checklists (adjusted β 15.22, P<.001); provide an accurate diagnosis (adjusted odds ratio [OR] 3.85, P<.001), an appropriate prescription (adjusted OR 3.87, P<.001), and lifestyle modification advice (adjusted OR 6.82, P<.001); ensure more PCC (adjusted β 3.34, P<.001); and spend more time with SPs (adjusted β 839.70, P<.001), with more responses (adjusted β 1.33, P=.001) and more words (adjusted β 50.93, P=.009). However, SPs on private platforms waited longer for the first response (adjusted β 505.87, P=.001) and each response (adjusted β 168.33, P=.04) and paid more for the average visit (adjusted β 40.03, P<.001).
Conclusions: There is significant quality inequality across DTC telemedicine platforms. Private physicians might provide a higher quality of service regarding effectiveness and safety, PCC, and the number of responses and words. However, private platforms have longer wait times for the first response, as well as higher costs. Refining online reviews, establishing standardized norms and pricing, enhancing the performance evaluation mechanism for public DTC telemedicine, and imposing stricter limits on the first response time for private physicians should be considered practical approaches to optimizing the management of DTC telemedicine.
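As a reading aid for the adjusted odds ratios (ORs) reported in the Results, the sketch below shows how an OR relates to a logistic regression coefficient and what it would imply for a probability. The baseline probability of 0.30 is purely hypothetical and does not come from the study; the function names are also illustrative. Because the reported ORs are adjusted for covariates and fixed effects, this is only an approximate way to build intuition, not a reproduction of the authors' analysis.

```python
import math


def log_odds_coefficient(odds_ratio: float) -> float:
    """A logistic regression coefficient is the natural log of the odds ratio."""
    return math.log(odds_ratio)


def probability_after(odds_ratio: float, baseline_probability: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = odds_ratio * baseline_odds
    return new_odds / (1 + new_odds)


if __name__ == "__main__":
    reported_or = 3.85          # adjusted OR for an accurate diagnosis (from the abstract)
    hypothetical_baseline = 0.30  # ASSUMPTION: illustrative value, not reported in the study
    print(f"coefficient = {log_odds_coefficient(reported_or):.3f}")
    print(f"implied probability = {probability_after(reported_or, hypothetical_baseline):.3f}")
```

The point of the conversion is that an OR of 3.85 does not mean "3.85 times as likely": the change in probability depends on the baseline, which is why the baseline must be stated whenever an OR is translated into percentages.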
{"title":"Comparing the Quality of Direct-to-Consumer Telemedicine Dominated and Delivered by Public and Private Sector Platforms in China: Standardized Patient Study.","authors":"Faying Song, Xue Gong, Yuting Yang, Rui Guo","doi":"10.2196/55400","DOIUrl":"https://doi.org/10.2196/55400","journal":{"name":"Journal of Medical Internet Research","volume":"26","pages":"e55400"},"publicationDate":"2024-11-14","publicationTypes":"Journal Article"}
Shuai Ming, Xi Yao, Xiaohong Guo, Qingge Guo, Kunpeng Xie, Dandan Chen, Bo Lei
Background: Artificial intelligence (AI) chatbots such as ChatGPT are expected to impact vision health care significantly. Their potential to optimize the consultation process and their diagnostic capabilities across a range of ophthalmic subspecialties have yet to be fully explored.
Objective: This study aims to investigate the performance of AI chatbots in recommending ophthalmic outpatient registration and diagnosing eye diseases within clinical case profiles.
Methods: This cross-sectional study used clinical cases from Chinese Standardized Resident Training-Ophthalmology (2nd Edition). For each case, 2 profiles were created: patient with history (Hx) and patient with history and examination (Hx+Ex). These profiles served as independent queries for GPT-3.5 and GPT-4.0 (accessed from March 5 to 18, 2024). Similarly, the same profiles were presented to 3 ophthalmic residents in a questionnaire format. The accuracy of recommending ophthalmic subspecialty registration was primarily evaluated using Hx profiles. The accuracy of the top-ranked diagnosis and the accuracy of the diagnosis within the top 3 suggestions (do-not-miss diagnosis) were assessed using Hx+Ex profiles. The gold standard for judgment was the published, official diagnosis. Characteristics of incorrect diagnoses by ChatGPT were also analyzed.
Results: A total of 208 clinical profiles from 12 ophthalmic subspecialties were analyzed (104 Hx and 104 Hx+Ex profiles). For Hx profiles, GPT-3.5, GPT-4.0, and residents showed comparable accuracy in registration suggestions (66/104, 63.5%; 81/104, 77.9%; and 72/104, 69.2%, respectively; P=.07), with ocular trauma, retinal diseases, and strabismus and amblyopia achieving the top 3 accuracies. For Hx+Ex profiles, both GPT-4.0 and residents demonstrated higher diagnostic accuracy than GPT-3.5 (62/104, 59.6% and 63/104, 60.6% vs 41/104, 39.4%; P=.003 and P=.001, respectively). Accuracy for do-not-miss diagnoses also improved (79/104, 76% and 68/104, 65.4% vs 51/104, 49%; P<.001 and P=.02, respectively). The highest diagnostic accuracies were observed in glaucoma; lens diseases; and eyelid, lacrimal, and orbital diseases. GPT-4.0 recorded fewer incorrect top-3 diagnoses (25/42, 60% vs 53/63, 84%; P=.005) and more partially correct diagnoses (21/42, 50% vs 7/63, 11%; P<.001) than GPT-3.5, while GPT-3.5 had more completely incorrect (27/63, 43% vs 7/42, 17%; P=.005) and less precise diagnoses (22/63, 35% vs 5/42, 12%; P=.009).
Conclusions: GPT-3.5 and GPT-4.0 showed intermediate performance in recommending ophthalmic subspecialties for registration. While GPT-3.5 underperformed, GPT-4.0 approached and numerically surpassed residents in differential diagnosis. AI chatbots show promise in facilitating ophthalmic patient registration. However, their integration into diagnostic decision-making requires more validation.
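The accuracy figures in the Results are simple proportions of 104 profiles, so they can be recomputed from the reported counts. The sketch below does only that arithmetic; the dictionary layout and the function name `accuracy_percent` are illustrative choices, and no significance test is attempted because the abstract does not state which test produced the P values.

```python
# Counts (correct, total) copied from the Results of the abstract.
COUNTS = {
    "registration suggestion (Hx)": {
        "GPT-3.5": (66, 104),
        "GPT-4.0": (81, 104),
        "residents": (72, 104),
    },
    "top-ranked diagnosis (Hx+Ex)": {
        "GPT-3.5": (41, 104),
        "GPT-4.0": (62, 104),
        "residents": (63, 104),
    },
    "do-not-miss diagnosis (Hx+Ex)": {
        "GPT-3.5": (51, 104),
        "GPT-4.0": (79, 104),
        "residents": (68, 104),
    },
}


def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to 1 decimal place."""
    return round(100 * correct / total, 1)


if __name__ == "__main__":
    for task, by_rater in COUNTS.items():
        summary = ", ".join(
            f"{rater}: {accuracy_percent(correct, total)}%"
            for rater, (correct, total) in by_rater.items()
        )
        print(f"{task} -> {summary}")
```

Laying the counts out per task makes the pattern in the Conclusions easy to see: GPT-4.0 and the residents are close on diagnosis, while GPT-3.5 trails on both diagnostic measures.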
{"title":"Performance of ChatGPT in Ophthalmic Registration and Clinical Diagnosis: Cross-Sectional Study.","authors":"Shuai Ming, Xi Yao, Xiaohong Guo, Qingge Guo, Kunpeng Xie, Dandan Chen, Bo Lei","doi":"10.2196/60226","DOIUrl":"https://doi.org/10.2196/60226","journal":{"name":"Journal of Medical Internet Research","volume":"26","pages":"e60226"},"publicationDate":"2024-11-14","publicationTypes":"Journal Article"}
Radha Nagarajan, Midori Kondo, Franz Salas, Emre Sezgin, Yuan Yao, Vanessa Klotzman, Sandip A Godambe, Naqi Khan, Alfonso Limon, Graham Stephenson, Sharief Taraman, Nephi Walton, Louis Ehwerhemuepha, Jay Pandit, Deepti Pandita, Michael Weiss, Charles Golden, Adam Gold, John Henderson, Angela Shippy, Leo Anthony Celi, William R Hogan, Eric K Oermann, Terence Sanger, Steven Martel
Large language models (LLMs) continue to exhibit noteworthy capabilities across a spectrum of areas, including emerging proficiencies across the health care continuum. Successful LLM implementation and adoption depend on digital readiness, modern infrastructure, a trained workforce, privacy, and an ethical regulatory landscape. These factors can vary significantly across health care ecosystems, dictating the choice of a particular LLM implementation pathway. This perspective discusses 3 LLM implementation pathways, namely the training from scratch pathway (TSP), fine-tuned pathway (FTP), and out-of-the-box pathway (OBP), as potential onboarding points for health systems while facilitating equitable adoption. The choice of a particular pathway is governed by needs as well as affordability. Therefore, the risks, benefits, and economics of these pathways across 4 major cloud service providers (Amazon, Microsoft, Google, and Oracle) are presented. While cost comparisons, such as on-demand and spot pricing across the cloud service providers for the 3 pathways, are presented for completeness, the usefulness of managed services and cloud enterprise tools is elucidated. Managed services can complement the traditional workforce and expertise, while enterprise tools, such as federated learning, can overcome sample size challenges when implementing LLMs using health care data. Of the 3 pathways, TSP is expected to be the most resource-intensive regarding infrastructure and workforce while providing maximum customization, enhanced transparency, and performance. Because TSP trains the LLM using enterprise health care data, it is expected to harness the digital signatures of the population served by the health care system with the potential to impact outcomes. The use of pretrained models in FTP is a limitation that may affect its performance, because the training data used in the pretrained model may have hidden bias and may not necessarily be health care-related. However, FTP provides a balance between customization, cost, and performance. While OBP can be rapidly deployed, it provides minimal customization and transparency without guaranteeing long-term availability. OBP may also present challenges in interfacing seamlessly with downstream applications in health care settings with variations in pricing and use over time. Lack of customization in OBP can significantly limit its ability to impact outcomes. Finally, potential applications of LLMs in health care, including conversational artificial intelligence, chatbots, summarization, and machine translation, are highlighted. While the 3 implementation pathways discussed in this perspective have the potential to facilitate equitable adoption and democratization of LLMs, transitions between them may be necessary as the needs of health systems evolve. Understanding the economics and trade-offs of these onboarding pathways can guide their strategic adoption and demonstrate value while impacting health care outcomes favorably.
{"title":"Economics and Equity of Large Language Models: Health Care Perspective.","authors":"Radha Nagarajan, Midori Kondo, Franz Salas, Emre Sezgin, Yuan Yao, Vanessa Klotzman, Sandip A Godambe, Naqi Khan, Alfonso Limon, Graham Stephenson, Sharief Taraman, Nephi Walton, Louis Ehwerhemuepha, Jay Pandit, Deepti Pandita, Michael Weiss, Charles Golden, Adam Gold, John Henderson, Angela Shippy, Leo Anthony Celi, William R Hogan, Eric K Oermann, Terence Sanger, Steven Martel","doi":"10.2196/64226","DOIUrl":"https://doi.org/10.2196/64226","journal":{"name":"Journal of Medical Internet Research","volume":"26","pages":"e64226"},"publicationDate":"2024-11-14","publicationTypes":"Journal Article"}