Background: ChatGPT-4o, Google Gemini, and Microsoft Copilot have shown potential in generating health care-related information. However, their accuracy, completeness, and safety for providing drug-related information in Thai contexts remain underexplored.
Objective: This study aims to evaluate the performance of artificial intelligence (AI) systems in responding to drug-related questions in Thai.
Methods: An analytical cross-sectional study was conducted using 76 public drug-related questions compiled from medical databases and social media between November 1, 2019, and December 31, 2024. All questions were categorized into 19 distinct categories, each comprising 4 questions. ChatGPT-4o, Google Gemini, and Microsoft Copilot were queried in a single session on March 1, 2025, by using input in Thai. All responses were evaluated for correctness, completeness, risk, and reproducibility independently by clinical pharmacists using standardized evaluation criteria.
Results: All 3 AI models provided generally complete responses (P=.08). ChatGPT-4o yielded the highest proportion of fully correct responses (P=.08). The overall proportions of high-risk answers did not differ significantly across models (P=.12). Response correctness was influenced by the category of the drug-related question (P=.002), whereas completeness was not (P=.23). The correctness of Google Gemini and Microsoft Copilot was higher than that of ChatGPT-4o for pharmacology queries. Question category also had a statistically significant effect on the risk level of the answers (P=.04). In particular, the pregnancy and lactation category had the highest high-risk response rate (1/76, 1% per system). All 3 AI models demonstrated consistent response patterns when the same questions were re-queried after 1, 7, and 14 days.
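The abstract does not state which statistical test produced these P values. As a hedged illustration only, a chi-square test of independence comparing response-correctness counts across the 3 systems could be computed as follows; the counts and category labels are hypothetical and do not reproduce the study's data.

```python
# Hypothetical sketch: comparing response-correctness counts across 3 chatbots
# with a chi-square test of independence. The counts are illustrative only and
# the abstract does not name the exact test used in the study.
from scipy.stats import chi2_contingency

# Rows: ChatGPT-4o, Google Gemini, Microsoft Copilot
# Columns: fully correct, partially correct, incorrect (out of 76 questions each)
counts = [
    [52, 18, 6],
    [45, 22, 9],
    [43, 24, 9],
]

chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, df = {dof}, P = {p_value:.3f}")
```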
Conclusions: The evaluated AI chatbots were able to answer the queries with generally complete content; however, we found limited accuracy and occasional high-risk errors in responding to drug-related questions in Thai. All models exhibited good reproducibility.
Background: Mental disorders are frequently evaluated using questionnaires, which have been developed over the past decades for the assessment of different conditions. Despite the rigorous validation of these tools, high levels of content divergence have been reported for questionnaires measuring the same construct of psychopathology. Previous studies that examined the content overlap required manual symptom labeling, which is observer-dependent and time-consuming.
Objective: In this study, we used large language models (LLMs) to analyze content overlap of mental health questionnaires in an observer-independent way and compare our results with clinical expertise.
Methods: We analyzed questionnaires from a range of mental health conditions, including adult depression (n=7), childhood depression (n=15), clinical high risk for psychosis (CHR-P; n=11), mania (n=7), obsessive-compulsive disorder (n=7), and sleep disorder (n=12). Two different LLM-based approaches were tested. First, we used sentence Bidirectional Encoder Representations from Transformers (sBERT) to derive numerical representations (embeddings) for each questionnaire item, which were then clustered using k-means to group semantically similar symptoms. Second, questionnaire items were submitted as prompts to a Generative Pretrained Transformer (GPT) model to identify underlying symptom clusters. Clustering results were compared with a manual categorization by experts using the adjusted Rand index. Further, we assessed the content overlap within each diagnostic domain based on LLM-derived clusters.
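As a minimal sketch of the first (sBERT-based) pipeline described above, questionnaire items could be embedded, clustered with k-means, and compared against an expert categorization via the adjusted Rand index roughly as follows; the model name, example items, cluster count, and expert labels are placeholders, not the study's materials.

```python
# Minimal sketch of the sBERT pipeline: embed items, cluster with k-means,
# and compare against an expert categorization via the adjusted Rand index.
# Model name, items, cluster count, and expert labels are illustrative only.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

items = [
    "I feel sad most of the day",
    "I have lost interest in activities I used to enjoy",
    "I have trouble falling asleep",
    "I wake up much earlier than I intend to",
]
expert_labels = [0, 0, 1, 1]  # hypothetical expert symptom categories

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder sBERT model
embeddings = model.encode(items)                 # one embedding vector per item

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
ari = adjusted_rand_score(expert_labels, kmeans.labels_)
print(f"Adjusted Rand index vs expert clustering: {ari:.3f}")
```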
Results: We observed varying degrees of similarity between expert-based and LLM-based clustering across diagnostic domains. Overall, agreement between experts was higher than between experts and LLMs. Among the 2 LLM approaches, GPT showed greater alignment with expert ratings than sBERT, ranging from weak to strong similarity depending on the diagnostic domain. Using GPT-based clustering of questionnaire items to assess the content overlap within each diagnostic domain revealed a weak (CHR-P: 0.344) to moderate (adult depression: 0.574; childhood depression: 0.433; mania: 0.419; obsessive-compulsive disorder [OCD]: 0.450; sleep disorder: 0.445) content overlap of questionnaires. Compared with previous studies that manually investigated content overlap among these scales, our results showed some variation, although the differences were not substantial.
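The abstract does not specify how content overlap was quantified. One common choice in this literature is a mean pairwise Jaccard index over the sets of symptom clusters covered by each questionnaire; the sketch below illustrates that assumed metric with invented cluster assignments, not the study's definition or data.

```python
# Hypothetical sketch: content overlap within a diagnostic domain computed as
# the mean pairwise Jaccard index over the sets of LLM-derived symptom
# clusters covered by each questionnaire. Both the metric choice and the
# cluster assignments are illustrative assumptions.
from itertools import combinations

# Each questionnaire maps to the set of cluster IDs its items fall into.
questionnaire_clusters = {
    "scale_A": {0, 1, 2, 3},
    "scale_B": {1, 2, 3, 4},
    "scale_C": {0, 2, 4, 5},
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

pairs = list(combinations(questionnaire_clusters.values(), 2))
overlap = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
print(f"Mean pairwise content overlap: {overlap:.3f}")
```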
Conclusions: These findings demonstrate the feasibility of using LLMs to objectively assess content overlap in diagnostic questionnaires. Notably, the GPT-based approach showed particular promise in aligning with expert-derived symptom structures.
Background: Systematic literature reviews (SLRs) build the foundation for evidence synthesis, but they are exceptionally demanding in terms of time and resources. While recent advances in artificial intelligence (AI), particularly large language models, offer the potential to accelerate this process, their use introduces challenges to transparency and reproducibility. Reporting guidelines such as the PRISMA-AI (Preferred Reporting Items for Systematic Reviews and Meta-Analyses-Artificial Intelligence Extension) primarily focus on AI as a subject of research, not as a tool in the review process itself.
Objective: To address the gap in reporting standards, this study aimed to develop and propose a discipline-agnostic checklist extension to the PRISMA 2020 statement. The goal was to ensure transparent reporting when AI is used as a methodological tool in evidence synthesis, fostering trust in the next generation of SLRs.
Methods: The proposed checklist, named PRISMA-trAIce (PRISMA-Transparent Reporting of Artificial Intelligence in Comprehensive Evidence Synthesis), was developed through a systematic process. We conducted a literature search to identify established, consensus-based AI reporting guidelines (eg, CONSORT-AI [Consolidated Standards of Reporting Trials-Artificial Intelligence] and TRIPOD-AI [Transparent Reporting of a Multivariable Prediction Model of Individual Prognosis or Diagnosis-Artificial Intelligence]). Relevant items from these frameworks were extracted, analyzed, and thematically synthesized to form a modular checklist that integrated with the PRISMA 2020 structure.
Results: The primary result of this work is the PRISMA-trAIce checklist, a comprehensive set of reporting items designed to document the use of AI in SLRs. The checklist covers the entire structure of an SLR, from title and abstract to methods and discussion, and includes specific items for identifying AI tools, describing human-AI interaction, reporting performance evaluation, and discussing limitations.
Conclusions: PRISMA-trAIce establishes an important framework to improve the transparency and methodological integrity of AI-assisted systematic reviews, enhancing the trust required for their responsible application in evidence synthesis. We present this work as a foundational proposal, explicitly inviting the scientific community to join an open science process of consensus building. Through this collaborative refinement, we aim to evolve PRISMA-trAIce into a formally endorsed guideline, thereby ensuring the collective validation and scientific rigor of future AI-driven research.
Background: Spinal cord injury (SCI) is a complex and heterogeneous condition that has received considerable attention. Increasingly, the prognosis of patients with SCI is being predicted using machine learning (ML) techniques.
Objective: This study aims to evaluate the performance and quality of ML models in predicting the outcomes of SCI.
Methods: Literature searches were conducted in PubMed, Web of Science, Embase, PROSPERO, Scopus, Cochrane Library, China National Knowledge Infrastructure, China Biomedical Literature Service System, and Wanfang databases. Meta-analysis of the area under the receiver operating characteristic curve of ML models was performed to comprehensively evaluate their performance.
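The abstract does not describe the pooling model used for the meta-analysis. As a hedged illustration, a simple inverse-variance random-effects (DerSimonian-Laird) pooling of study-level AUCs, with standard errors back-calculated from reported 95% CIs, might look roughly like this; the input values are hypothetical.

```python
# Hypothetical sketch: inverse-variance random-effects (DerSimonian-Laird)
# pooling of AUC estimates, with standard errors derived from 95% CI widths.
# The input values are illustrative and the pooling model is an assumption;
# the abstract does not state the exact meta-analytic method used.
import math

# (AUC, lower 95% CI, upper 95% CI) for several hypothetical studies
studies = [(0.81, 0.74, 0.88), (0.75, 0.66, 0.84), (0.84, 0.78, 0.90)]

aucs = [a for a, lo, hi in studies]
ses = [(hi - lo) / (2 * 1.96) for a, lo, hi in studies]  # SE from CI width
w_fixed = [1 / se**2 for se in ses]

# Between-study heterogeneity (tau^2) via the DerSimonian-Laird estimator
fixed_mean = sum(w * a for w, a in zip(w_fixed, aucs)) / sum(w_fixed)
q = sum(w * (a - fixed_mean) ** 2 for w, a in zip(w_fixed, aucs))
df = len(studies) - 1
c = sum(w_fixed) - sum(w**2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - df) / c)

w_random = [1 / (se**2 + tau2) for se in ses]
pooled = sum(w * a for w, a in zip(w_random, aucs)) / sum(w_random)
pooled_se = math.sqrt(1 / sum(w_random))
print(f"Pooled AUC = {pooled:.3f} "
      f"(95% CI {pooled - 1.96 * pooled_se:.3f}-{pooled + 1.96 * pooled_se:.3f})")
```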
Results: A total of 1254 articles were retrieved, and 13 eligible studies were included. Predictive outcomes included spinal cord function prognosis, postoperative complications, independent living ability, and walking ability. For spinal cord function prognosis, the area under the curve (AUC) of the random forest algorithm was 0.832, the AUC of the logistic regression algorithm was 0.813 (95% CI 0.805-0.883), the AUC of the decision tree algorithm was 0.747 (95% CI 0.677-0.802), and the AUC of the XGBoost (extreme gradient boosting) algorithm was 0.867. For postoperative complications, the AUC of the random forest algorithm was 0.627 (95% CI 0.441-0.812), the AUC of the logistic regression algorithm was 0.747 (95% CI 0.597-0.896), and the AUC of the decision tree algorithm was 0.688. For independent living ability, the AUC of the classification and regression tree model was 0.813. For walking ability, the model based on the vector machine algorithm was the most effective, with an AUC of 0.780.
Conclusions: The ML models predict SCI outcomes with relative accuracy, particularly in spinal cord function prognosis. They are expected to become important tools for clinicians in assessing the prognosis of patients with SCI, with the XGBoost algorithm showing the best performance. Prediction models are expected to continue improving as larger datasets are used and ML algorithms develop.
Background: Neglected tropical diseases (NTDs) are among the most prevalent diseases in tropical and subtropical regions and comprise 21 different conditions. One-half of these conditions have skin manifestations, known as skin NTDs. The diagnosis of skin NTDs incorporates visual examination of patients, and deep learning (DL)-based diagnostic tools can be used to assist the diagnostic process. The use of advanced DL-based methods, including multimodal data fusion (MMDF) functionality, could be a potential approach to enhance the diagnosis of these diseases. However, little has been done to apply such tools, as reflected by the very few available studies that have implemented MMDF for skin NTDs.
Objective: This article presents a systematic review regarding the use of DL-based MMDF methods for the diagnosis of skin NTDs and related diseases (non-NTD skin diseases), including the ethical risks and potential risk of bias.
Methods: The review was conducted based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) method using 6 parameters (research approach followed, disease[s] diagnosed, dataset[s] used, algorithm[s] applied, performance achieved, and future direction[s]).
Results: Initially, 437 articles were collected from 5 major groups of identified sources; 14 articles were selected for the final analysis. Results revealed that, compared with traditional methods, the MMDF methods improved model performance for the diagnosis of skin NTDs and non-NTD skin diseases. Algorithmically, convolutional neural network (CNN)-based models were the predominantly used DL architectures (9/14 studies, 64%), performing feature extraction, feature fusion, and disease classification; these tasks were also carried out with transformer-based methods (1/14, 7%). Furthermore, recurrent neural networks were used in combination with CNN-based feature extractors to fuse multimodal data (1/14, 7%) and with generative models (1/14, 7%). The remaining studies used study-specific algorithms based on transformers (1/14, 7%) and generative models (1/14, 7%).
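As a hedged illustration of the CNN-based fusion approach that most of the reviewed studies describe, a minimal late-fusion model might combine CNN image features with tabular clinical metadata before classification; the architecture, layer sizes, and class count below are illustrative assumptions, not a model drawn from the reviewed studies.

```python
# Minimal sketch of a CNN-based multimodal data fusion (MMDF) classifier:
# an image branch (small CNN) and a tabular metadata branch are fused by
# concatenation before classification. All sizes and the class count are
# illustrative; this is not an architecture from the reviewed studies.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, n_meta_features: int = 8, n_classes: int = 5):
        super().__init__()
        self.image_branch = nn.Sequential(           # CNN feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),                             # -> 32-dim image features
        )
        self.meta_branch = nn.Sequential(             # tabular metadata encoder
            nn.Linear(n_meta_features, 16), nn.ReLU(),
        )
        self.classifier = nn.Linear(32 + 16, n_classes)  # fusion by concatenation

    def forward(self, image, metadata):
        fused = torch.cat([self.image_branch(image), self.meta_branch(metadata)], dim=1)
        return self.classifier(fused)

model = LateFusionClassifier()
logits = model(torch.randn(2, 3, 128, 128), torch.randn(2, 8))
print(logits.shape)  # torch.Size([2, 5])
```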
Conclusions: This review suggests that further studies should be conducted on the use of DL-based MMDF methods for skin NTDs, considering model efficiency, data scarcity, algorithm selection and use, fusion strategies for multiple modalities, and the possible adoption of such tools in resource-constrained areas.
Unlabelled: Artificial intelligence (AI) is revolutionizing digital health, driving innovation in care delivery and operational efficiency. Despite its potential, many AI systems fail to meet real-world expectations due to limited evaluation practices that focus narrowly on short-term metrics such as efficiency and technical accuracy. Ignoring factors such as usability, trust, transparency, and adaptability hinders AI adoption, scalability, and long-term impact in health care. This paper emphasizes the importance of embedding scientific evaluation as a core operational layer throughout the AI life cycle. We outline practical guidelines for digital health companies to improve AI integration and evaluation, informed by over 35 years of experience in science, the digital health industry, and AI development. We describe a multistep approach, including stakeholder analysis, real-time monitoring, and iterative improvement, that digital health companies can adopt to ensure robust AI integration. Key recommendations include assessing stakeholder needs, designing AI systems that can check their own work, conducting testing to address usability and biases, and ensuring continuous improvement to keep systems user-centered and adaptable. By integrating these guidelines, digital health companies can improve AI reliability, scalability, and trustworthiness, driving better health care delivery and stakeholder alignment.
Background: Artificial intelligence (AI) and machine learning models are frequently developed in medical research to optimize patient care, yet they remain rarely used in clinical practice.
Objective: This study aims to understand the disconnect between model development and implementation by surveying physicians of all specialties across the United States.
Methods: The present survey was distributed to residency coordinators at Accreditation Council for Graduate Medical Education-accredited residency programs to disseminate among attending physicians and resident physicians affiliated with their academic institution. Respondents were asked to identify and quantify the extent of their training and specialization, as well as the type and location of their practice. Physicians were then asked follow-up questions regarding AI in their practice, including whether its use is permitted, whether they would use it if made available, primary reasons for using or not using AI, elements that would encourage its use, and ethical concerns.
Results: Of the 941 physicians who responded to the survey, 384 (40.8%) were attending physicians and 557 (59.2%) were resident physicians. The majority of the physicians (651/795, 81.9%) indicated that they would adopt AI in clinical practice if given the opportunity. The most cited intended uses for AI were risk stratification, image analysis or segmentation, and disease prognosis. The most common reservations were concerns about clinical errors made by AI and the potential to replicate human biases.
Conclusions: To date, this study comprises the largest and most diverse dataset of physician perspectives on AI. Our results emphasize that most academic physicians in the United States are open to adopting AI in their clinical practice. However, for AI to become clinically relevant, developers and physicians must work synergistically to design models that are accurate, accessible, and intuitive while thoroughly addressing ethical concerns associated with the implementation of AI in medicine.

