JMIR AI

Pub Date : 2024-10-30 DOI: 10.2196/55059

Mathieu Ravaut, Ruochen Zhao, Duy Phung, Vicky Mengqi Qin, Dusan Milovanovic, Anita Pienkowska, Iva Bojic, Josip Car, Shafiq Joty

Background: Global pandemics like COVID-19 place heavy strain on health care systems and health workers worldwide. These crises generate a vast volume of news published online across the globe. This extensive corpus of articles has the potential to provide valuable insights into the nature of ongoing events and guide interventions and policies. However, the sheer volume of information is beyond the capacity of human experts to process and analyze effectively.

Objective: The aim of this study was to explore how natural language processing (NLP) can be leveraged to build a system that allows for quick analysis of a high volume of news articles. A further objective was to create a workflow built on human-computer symbiosis that yields insights to support health workforce strategic policy dialogue, advocacy, and decision-making.

Methods: We conducted a review of open-source news coverage from January 2020 to June 2022 on COVID-19 and its impacts on the health workforce from the World Health Organization (WHO) Epidemic Intelligence from Open Sources (EIOS) by synergizing NLP models, including classification and extractive summarization, and human-generated analyses. Our DeepCovid system was trained on 2.8 million news articles in English from more than 3000 internet sources across hundreds of jurisdictions.

Results: Rules-based classification with hand-designed rules narrowed the data set to 8508 articles with high relevancy confirmed in the human-led evaluation. DeepCovid's automated information targeting component reached a very strong binary classification performance of 98.98 for the area under the receiver operating characteristic curve (ROC-AUC) and 47.21 for the area under the precision recall curve (PR-AUC). Its information extraction component attained good performance in automatic extractive summarization with a mean Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score of 47.76. DeepCovid's final summaries were used by human experts to write reports on the COVID-19 pandemic.

Conclusions: It is feasible to synergize high-performing NLP models and human-generated analyses to benefit open-source health workforce intelligence. The DeepCovid approach can contribute to an agile and timely global view, providing complementary information to scientific literature.
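The two classification figures reported in the Results behave very differently under class imbalance, which may help explain how a ROC-AUC near 99 can coexist with a PR-AUC below 50. The sketch below is illustrative only and is not taken from the DeepCovid code: the toy labels and scores are invented, and average precision is used as one common estimator of PR-AUC (the abstract does not say which estimator the authors used). Values here are on a 0 to 1 scale, whereas the abstract reports them on a 0 to 100 scale.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Invented toy data: 2 relevant articles among 100, with 3 irrelevant
# articles scored above both relevant ones.
labels = [0, 0, 0, 1, 1] + [0] * 95
scores = [0.9, 0.9, 0.9, 0.8, 0.7] + [0.1] * 95

print(f"ROC-AUC: {roc_auc(labels, scores):.3f}")
print(f"Average precision: {average_precision(labels, scores):.3f}")
```

In this toy case, three high-scoring negatives barely dent ROC-AUC because they are a small share of the 98 negatives, but they push both positives down the ranking and so cut precision sharply.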

Targeting COVID-19 and Human Resources for Health News Information Extraction: Algorithm Development and Validation. JMIR AI 2024;3:e55059. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11561429/pdf/

Citations: 0

Identifying Marijuana Use Behaviors Among Youth Experiencing Homelessness Using a Machine Learning-Based Framework: Development and Evaluation Study.

JMIR AI

Pub Date : 2024-10-17 DOI: 10.2196/53488

Tianjie Deng, Andrew Urbaczewski, Young Jin Lee, Anamika Barman-Adhikari, Rinku Dewri

Background: Youth experiencing homelessness face substance use problems disproportionately compared to other youth. A study found that 69% of youth experiencing homelessness meet the criteria for dependence on at least 1 substance, compared to 1.8% for all US adolescents. In addition, they experience major structural and social inequalities, which further undermine their ability to receive the care they need.

Objective: The goal of this study was to develop a machine learning-based framework that uses the social media content (posts and interactions) of youth experiencing homelessness to predict their substance use behaviors (ie, the probability of using marijuana). With this framework, social workers and care providers can identify and reach out to youth experiencing homelessness who are at a higher risk of substance use.

Methods: We recruited 133 young people experiencing homelessness at a nonprofit organization located in a city in the western United States. After obtaining their consent, we collected the participants' social media conversations for the past year before they were recruited, and we asked the participants to complete a survey on their demographic information, health conditions, sexual behaviors, and substance use behaviors. Building on the social sharing of emotions theory and social support theory, we identified important features that can potentially predict substance use. Then, we used natural language processing techniques to extract such features from social media conversations and reactions and built a series of machine learning models to predict participants' marijuana use.

Results: We evaluated our models based on their predictive performance as well as their conformity with measures of fairness. Without predictive features from survey information, which may introduce sex and racial biases, our machine learning models can reach an area under the curve of 0.72 and an accuracy of 0.81 using only social media data when predicting marijuana use. We also evaluated the false-positive rate for each sex and age segment.

Conclusions: We showed that textual interactions among youth experiencing homelessness and their friends on social media can serve as a powerful resource to predict their substance use. The framework we developed allows care providers to allocate resources efficiently to the youth experiencing homelessness who are in greatest need, with minimal overhead. It can be extended to analyze and predict other health-related behaviors and conditions observed in this vulnerable community.
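The fairness check mentioned in the Results (false-positive rate per sex and age segment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the group labels, and the toy predictions are all hypothetical.

```python
from collections import defaultdict


def false_positive_rate_by_group(y_true, y_pred, groups):
    """Return FP / (FP + TN) for each group, using only the true negatives' rows."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:
            negatives[group] += 1
            if pred == 1:
                false_pos[group] += 1
    return {group: false_pos[group] / negatives[group] for group in negatives}


# Invented toy data: 1 = predicted or reported marijuana use, 0 = none.
y_true = [0, 0, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

rates = false_positive_rate_by_group(y_true, y_pred, groups)
for group, rate in sorted(rates.items()):
    print(f"{group}: FPR = {rate:.2f}")
```

A large gap between groups would suggest that one segment is flagged for outreach more often without cause, which is the kind of disparity such a check is meant to surface.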

JMIR AI 2024;3:e53488. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11528171/pdf/

Citations: 0

Machine Learning-Based Prediction for High Health Care Utilizers by Using a Multi-Institutional Diabetes Registry: Model Training and Evaluation.

JMIR AI

Pub Date : 2024-10-17 DOI: 10.2196/58463

Joshua Kuan Tan, Le Quan, Nur Nasyitah Mohamed Salim, Jen Hong Tan, Su-Yen Goh, Julian Thumboo, Yong Mong Bee

Background: The cost of health care in many countries is increasing rapidly. There is a growing interest in using machine learning for predicting high health care utilizers for population health initiatives. Previous studies have focused on individuals who contribute to the highest financial burden. However, this group is small and represents a limited opportunity for long-term cost reduction.

Objective: We developed a collection of models that predict future health care utilization at various thresholds.

Methods: We utilized data from a multi-institutional diabetes database from the year 2019 to develop binary classification models. These models predict health care utilization in the subsequent year across 6 different outcomes: patients having a length of stay of ≥7, ≥14, and ≥30 days and emergency department attendance of ≥3, ≥5, and ≥10 visits. To address class imbalance, random and synthetic minority oversampling techniques were employed. The models were then applied to unseen data from 2020 and 2021 to predict health care utilization in the following year. A portfolio of performance metrics, with priority on area under the receiver operating characteristic curve, sensitivity, and positive predictive value, was used for comparison. Explainability analyses were conducted on the best performing models.

Results: When trained with random oversampling, 4 models (logistic regression, multivariate adaptive regression splines, boosted trees, and multilayer perceptron) consistently achieved high area under the receiver operating characteristic curve (>0.80) and sensitivity (>0.60) across training-validation and test data sets. Correcting for class imbalance proved critical for model performance. Important predictors for all outcomes included age, number of emergency department visits in the present year, chronic kidney disease stage, inpatient bed days in the present year, and mean hemoglobin A1c levels. Explainability analyses using partial dependence plots demonstrated that for the best performing models, the learned patterns were consistent with real-world knowledge, thereby supporting the validity of the models.

Conclusions: We successfully developed machine learning models capable of predicting high service level utilization with strong performance and valid explainability. These models can be integrated into wider diabetes-related population health initiatives.
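Random oversampling, one of the two imbalance corrections named in the Methods, can be sketched in a few lines. This is an assumption-laden illustration rather than the authors' pipeline: the function name and toy rows are hypothetical, the synthetic minority oversampling variant is not shown, and in practice the resampling would normally be applied to the training split only.

```python
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap up to the majority-class size.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Invented toy data: 8 low utilizers (label 0) and 2 high utilizers (label 1).
rows = [(i,) for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Because the added rows are exact copies, this approach adds no new information; it only changes how much weight the minority class carries during training.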

JMIR AI 2024;3:e58463. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11528163/pdf/

Citations: 0

Behavioral Nudging With Generative AI for Content Development in SMS Health Care Interventions: Case Study.

JMIR AI

Pub Date : 2024-10-15 DOI: 10.2196/52974

Rachel M Harrison, Ekaterina Lapteva, Anton Bibin

Background: Brief message interventions have demonstrated immense promise in health care, yet the development of these messages has suffered from a dearth of transparency and a scarcity of publicly accessible data sets. Moreover, the researcher-driven content creation process has raised resource allocation issues, necessitating a more efficient and transparent approach to content development.

Objective: This research sets out to address the challenges of content development for SMS interventions by showcasing the use of generative artificial intelligence (AI) as a tool for content creation, transparently explaining the prompt design and content generation process, and providing the largest publicly available data set of brief messages and source code for future replication of our process.

Methods: Leveraging the pretrained large language model GPT-3.5 (OpenAI), we generate a collection of messages in the context of medication adherence for individuals with type 2 diabetes using evidence-derived behavior change techniques identified in a prior systematic review. We create an attributed prompt designed to adhere to content (readability and tone) and SMS (character count and encoder type) standards while encouraging message variability to reflect differences in behavior change techniques.

Results: We deliver the most extensive repository of brief messages for a singular health care intervention and the first library of messages crafted with generative AI. In total, our method yields a data set comprising 1150 messages, with 89.91% (n=1034) meeting character length requirements and 80.7% (n=928) meeting readability requirements. Furthermore, our analysis reveals that all messages exhibit diversity comparable to an existing publicly available data set created under the same theoretical framework for a similar setting.

Conclusions: This research provides a novel approach to content creation for health care interventions using state-of-the-art generative AI tools. Future research is needed to assess the generated content for ethical, safety, and research standards, as well as to determine whether the intervention is successful in improving the target behaviors.
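The SMS standards mentioned in the Methods (character count and encoder type) could be checked along the following lines. This is a hedged sketch, not the authors' validation code: it assumes the target is a single unconcatenated message in the GSM 7-bit default alphabet with a 160-septet limit, which the abstract does not state, and it does not attempt the readability check. The function names and example messages are hypothetical.

```python
# GSM 7-bit default alphabet (basic table) and its extension table.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = set("^{}\\[~]|€")  # each costs two septets (escape + character)
SINGLE_SMS_SEPTETS = 160  # assumed limit for one unconcatenated GSM-7 message


def gsm7_septets(message):
    """Return the GSM-7 septet count, or None if any character is outside the alphabet."""
    total = 0
    for ch in message:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2
        else:
            return None
    return total


def fits_single_sms(message):
    """True if the message is GSM-7 encodable and fits in one assumed 160-septet SMS."""
    septets = gsm7_septets(message)
    return septets is not None and septets <= SINGLE_SMS_SEPTETS


examples = [
    "Time for your evening dose. Small steps keep your blood sugar steady!",
    "Great job \U0001F44D",  # the emoji forces a non-GSM-7 encoding
    "a" * 161,
]
for text in examples:
    print(fits_single_sms(text), len(text))

# The reported shares follow directly from the counts in the Results.
share_length_ok = round(100 * 1034 / 1150, 2)
share_readable = round(100 * 928 / 1150, 1)
print(share_length_ok, share_readable)
```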

JMIR AI 2024;3:e52974. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522651/pdf/

Citations: 0

The Dual Nature of AI in Information Dissemination: Ethical Considerations.

JMIR AI

Pub Date : 2024-10-15 DOI: 10.2196/53505

Federico Germani, Giovanni Spitale, Nikola Biller-Andorno

Infodemics pose significant dangers to public health and to the societal fabric, as the spread of misinformation can have far-reaching consequences. While artificial intelligence (AI) systems have the potential to craft compelling and valuable information campaigns with positive repercussions for public health and democracy, concerns have arisen regarding the potential use of AI systems to generate convincing disinformation. The consequences of this dual nature of AI, capable of both illuminating and obscuring the information landscape, are complex and multifaceted. We contend that the rapid integration of AI into society demands a comprehensive understanding of its ethical implications and the development of strategies to harness its potential for the greater good while mitigating harm. Thus, in this paper we explore the ethical dimensions of AI's role in information dissemination and impact on public health, arguing that potential strategies to deal with AI and disinformation encompass generating regulated and transparent data sets used to train AI models, regulating content outputs, and promoting information literacy.

JMIR AI 2024;3:e53505. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522648/pdf/

Citations: 0

The Utility and Implications of Ambient Scribes in Primary Care.

JMIR AI

Pub Date : 2024-10-04 DOI: 10.2196/57673

Puneet Seth, Romina Carretas, Frank Rudzicz

Ambient scribe technology, utilizing large language models, represents an opportunity for addressing several current pain points in the delivery of primary care. We explore the evolution of ambient scribes and their current use in primary care. We discuss the suitability of primary care for ambient scribe integration, considering the varied nature of patient presentations and the emphasis on comprehensive care. We also propose the stages of maturation in the use of ambient scribes in primary care and their impact on care delivery. Finally, we call for focused research on safety, bias, patient impact, and privacy in ambient scribe technology, emphasizing the need for early training and education of health care providers in artificial intelligence and digital health tools.

Citations: 0

Leveraging Temporal Trends for Training Contextual Word Embeddings to Address Bias in Biomedical Applications: Development Study.

JMIR AI

Pub Date : 2024-10-02 DOI: 10.2196/49546

Shunit Agmon, Uriel Singer, Kira Radinsky

Background: Women have been underrepresented in clinical trials for many years. Machine-learning models trained on clinical trial abstracts may capture and amplify biases in the data. Specifically, word embeddings are models that enable representing words as vectors and are the building block of most natural language processing systems. If word embeddings are trained on clinical trial abstracts, predictive models that use the embeddings will exhibit gender performance gaps.

Objective: We aim to capture temporal trends in clinical trials through temporal distribution matching on contextual word embeddings (specifically, BERT) and explore its effect on the bias manifested in downstream tasks.

Methods: We present TeDi-BERT, a method to harness the temporal trend of increasing women's inclusion in clinical trials to train contextual word embeddings. We implement temporal distribution matching through an adversarial classifier, trying to distinguish old from new clinical trial abstracts based on their embeddings. The temporal distribution matching acts as a form of domain adaptation from older to more recent clinical trials. We evaluate our model on 2 clinical tasks: prediction of unplanned readmission to the intensive care unit and hospital length of stay prediction. We also conduct an algorithmic analysis of the proposed method.

Results: In readmission prediction, TeDi-BERT achieved area under the receiver operating characteristic curve of 0.64 for female patients versus the baseline of 0.62 (P<.001), and 0.66 for male patients versus the baseline of 0.64 (P<.001). In the length of stay regression, TeDi-BERT achieved a mean absolute error of 4.56 (95% CI 4.44-4.68) for female patients versus 4.62 (95% CI 4.50-4.74, P<.001) and 4.54 (95% CI 4.44-4.65) for male patients versus 4.6 (95% CI 4.50-4.71, P<.001).

Conclusions: In both clinical tasks, TeDi-BERT improved performance for female patients, as expected; but it also improved performance for male patients. Our results show that accuracy for one gender does not need to be exchanged for bias reduction, but rather that good science improves clinical results for all. Contextual word embedding models trained to capture temporal trends can help mitigate the effects of bias that changes over time in the training data.
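The adversarial setup described in the Methods can be illustrated with a minimal sketch. The abstract does not state the loss functions, so everything below is an assumption: a generic adversarial domain-adaptation formulation in which a linear discriminator tries to tell old from new abstracts, and the encoder is rewarded when the discriminator fails. The names `discriminator_loss`, `encoder_objective`, and the weight `lam` are hypothetical and are not taken from TeDi-BERT.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_loss(embeddings, is_new, weights, bias):
    """Mean binary cross-entropy of a linear old-vs-new classifier.

    embeddings: list of feature vectors (one per abstract)
    is_new:     list of 0/1 labels (1 = recent abstract)
    """
    total = 0.0
    for x, y in zip(embeddings, is_new):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(embeddings)


def encoder_objective(task_loss, disc_loss, lam=1.0):
    """Assumed encoder objective: keep the task loss low while making
    old and new embeddings hard to distinguish (high discriminator loss)."""
    return task_loss - lam * disc_loss
```

When old and new embeddings are indistinguishable, the discriminator can do no better than chance and its loss sits at ln 2; when they are easily separable, its loss approaches 0, which the assumed encoder objective penalizes.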


Citations: 0

Impact of a Digital Scribe System on Clinical Documentation Time and Quality: Usability Study.

JMIR AI

Pub Date : 2024-09-23 DOI: 10.2196/60020

Marieke Meija van Buchem, Ilse M J Kant, Liza King, Jacqueline Kazmaier, Ewout W Steyerberg, Martijn P Bauer

Background: Physicians spend approximately half of their time on administrative tasks, which is one of the leading causes of physician burnout and decreased work satisfaction. The implementation of natural language processing-assisted clinical documentation tools may provide a solution.

Objective: This study investigates the impact of a commercially available Dutch digital scribe system on clinical documentation efficiency and quality.

Methods: Medical students with experience in clinical practice and documentation (n=22) created a total of 430 summaries of mock consultations and recorded the time they spent on this task. The consultations were summarized using 3 methods: manual summaries, fully automated summaries, and automated summaries with manual editing. We then randomly reassigned the summaries and evaluated their quality using a modified version of the Physician Documentation Quality Instrument (PDQI-9). We compared the differences between the 3 methods in descriptive statistics, quantitative text metrics (word count and lexical diversity), the PDQI-9, Recall-Oriented Understudy for Gisting Evaluation scores, and BERTScore.

Results: The median time for manual summarization was 202 seconds against 186 seconds for editing an automatic summary. Without editing, the automatic summaries attained a poorer PDQI-9 score than manual summaries (median PDQI-9 score 25 vs 31, P<.001, ANOVA test). Automatic summaries were found to have higher word counts but lower lexical diversity than manual summaries (P<.001, independent t test). The study revealed variable impacts on PDQI-9 scores and summarization time across individuals. Generally, students viewed the digital scribe system as a potentially useful tool, noting its ease of use and time-saving potential, though some criticized the summaries for their greater length and rigid structure.

Conclusions: This study highlights the potential of digital scribes in improving clinical documentation processes by offering a first summary draft for physicians to edit, thereby reducing documentation time without compromising the quality of patient records. Furthermore, digital scribes may be more beneficial to some physicians than to others and could play a role in improving the reusability of clinical documentation. Future studies should focus on the impact and quality of such a system when used by physicians in clinical practice.
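The quantitative text metrics named in the Methods (word count and lexical diversity) can be sketched as follows. The abstract does not say which lexical-diversity measure was used; the type-token ratio below is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; \\w is Unicode-aware, so accented letters are kept."""
    return re.findall(r"\w+", text.lower())


def word_count(text: str) -> int:
    return len(tokenize(text))


def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words (assumed diversity measure)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0
```

For example, "The patient reports pain. The pain is mild." has 8 tokens and 6 distinct words, giving a ratio of 0.75. Because the type-token ratio tends to fall as texts get longer, a length-corrected measure may be more appropriate when comparing summaries of different lengths.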


Citations: 0

Predictive Modeling of Hypertension-Related Postpartum Readmission: Retrospective Cohort Analysis.

JMIR AI

Pub Date : 2024-09-13 DOI: 10.2196/48588

Jinxin Tao, Ramsey G Larson, Yonatan Mintz, Oguzhan Alagoz, Kara K Hoppe

Background: Hypertension is the most common reason for postpartum hospital readmission. Better prediction of postpartum readmission will improve the health care of patients. Such prediction models will allow better use of resources and decrease health care costs.

Objective: This study aimed to evaluate clinical predictors of postpartum readmission for hypertension using a novel machine learning (ML) model that can effectively predict readmissions and balance treatment costs. We examined whether blood pressure and other measures during labor, not just postpartum measures, would be important predictors of readmission.

Methods: We conducted a retrospective cohort study using the PeriData website data set from a single midwestern academic center, covering all women who delivered from 2009 to 2018. The study comprises 2 data sets: one spanning 2009-2015 and the other spanning 2016-2018. A total of 47 clinical and demographic variables were collected, including blood pressure measurements during labor and post partum, laboratory values, and medication administration. Hospital readmissions were verified by patient chart review. In total, 32,645 patients were considered in the study. For our analysis, we trained several cost-sensitive ML models to predict the primary outcome of hypertension-related postpartum readmission within 42 days post partum. Models were evaluated using cross-validation and on independent data sets (models trained on data from 2009 to 2015 were validated on the data from 2016 to 2018). To assess clinical viability, a cost analysis of the models was performed to see how their recommendations could affect treatment costs.

Results: Of the 32,645 patients included in the study, 170 were readmitted due to a hypertension-related diagnosis. A cost-sensitive random forest method was found to be the most effective, with a balanced accuracy of 76.61% for predicting readmission. Using a feature importance and area under the curve analysis, the most important variables for predicting readmission were blood pressures in labor and 24-48 hours post partum, which increased the area under the curve of the model from 0.69 (SD 0.06) to 0.81 (SD 0.06; P=.05). Cost analysis showed that the resulting model could have reduced associated readmission costs by US $6000 against comparable models with similar F1-score and balanced accuracy. The most effective model was then implemented as a risk calculator that is publicly available. The code for this calculator and the model is also publicly available at a GitHub repository.

Conclusions: Blood pressure measurements during labor through 48 hours post partum can be combined with other variables to predict women at risk for postpartum readmission. Using ML techniques in conjunction with these data has the potential to improve health outcomes and reduce associated costs. The use of the calculator can greatly assist clinicians in providing care to patients and improve medical decision-making.
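Balanced accuracy is reported here rather than plain accuracy because readmission is rare (170 of 32,645 patients). A minimal sketch of the standard definition follows. The "predict no readmission for everyone" classifier in the usage note is a hypothetical illustration, not a model from the study.

```python
def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Fraction of all patients classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity (recall on readmitted patients) and
    specificity (recall on non-readmitted patients)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

With the cohort sizes from the abstract, a classifier that never predicts readmission gives `accuracy(0, 170, 32475, 0)` above 0.99 but `balanced_accuracy(0, 170, 32475, 0)` of exactly 0.5, which is why the balanced figure is the more informative one for such imbalanced data.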


Citations: 0

Development of Lung Cancer Risk Prediction Machine Learning Models for Equitable Learning Health System: Retrospective Study.

JMIR AI

Pub Date : 2024-09-11 DOI: 10.2196/56590

Anjun Chen, Erman Wu, Ran Huang, Bairong Shen, Ruobing Han, Jian Wen, Zhiyong Zhang, Qinghua Li

Background: A significant proportion of young at-risk patients and nonsmokers are excluded by the current guidelines for lung cancer (LC) screening, resulting in low screening adoption. The vision of the US National Academy of Medicine to transform health systems into learning health systems (LHS) holds promise for bringing necessary structural changes to health care, thereby addressing the exclusivity and adoption issues of LC screening.

Objective: This study aims to realize the LHS vision by designing an equitable, machine learning (ML)-enabled LHS unit for LC screening. It focuses on developing an inclusive and practical LC risk prediction model, suitable for initializing the ML-enabled LHS (ML-LHS) unit. This model aims to empower primary physicians in a clinical research network, linking central hospitals and rural clinics, to routinely deliver risk-based screening for enhancing LC early detection in broader populations.

Methods: We created a standardized data set of health factors from 1397 patients with LC and 1448 control patients, all aged 30 years and older, including both smokers and nonsmokers, from a hospital's electronic medical record system. Initially, a data-centric ML approach was used to create inclusive ML models for risk prediction from all available health factors. Subsequently, a quantitative distribution of LC health factors was used in feature engineering to refine the models into a more practical model with fewer variables.

Results: The initial inclusive 250-variable XGBoost model for LC risk prediction achieved performance metrics of 0.86 recall, 0.90 precision, and 0.89 accuracy. Post feature refinement, a practical 29-variable XGBoost model was developed, displaying performance metrics of 0.80 recall, 0.82 precision, and 0.82 accuracy. This model met the criteria for initializing the ML-LHS unit for risk-based, inclusive LC screening within clinical research networks.

Conclusions: This study designed an innovative ML-LHS unit for a clinical research network, aiming to sustainably provide inclusive LC screening to all at-risk populations. It developed an inclusive and practical XGBoost model from hospital electronic medical record data, capable of initializing such an ML-LHS unit for community and rural clinics. The anticipated deployment of this ML-LHS unit is expected to significantly improve LC-screening rates and early detection among broader populations, including those typically overlooked by existing screening guidelines.
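The recall, precision, and accuracy figures in the Results follow the usual confusion-matrix definitions, sketched below. The abstract does not report the underlying counts or the evaluation split, so the counts in the usage note are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def recall(tp: int, fn: int) -> float:
    """Share of true LC cases that the model flags."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of flagged patients who truly have LC."""
    return tp / (tp + fp)


def overall_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Share of all patients classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)
```

With hypothetical counts of 80 true positives, 20 false negatives, 10 false positives, and 90 true negatives, recall is 0.80, precision is about 0.89, and accuracy is 0.85. For a screening tool, recall is the figure to watch, since a missed case is usually costlier than an unnecessary follow-up.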


Citations: 0

Latest articles from JMIR AI