Journal of the American Medical Informatics Association: Latest Articles (Page 7)

Correction to: Are medical history data fit for risk stratification of patients with chest pain in emergency care? Comparing data collected from patients using computerized history taking with data documented by physicians in the electronic health record in the CLEOS-CPDS prospective cohort study.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae252

Citations: 0

Comparative analysis of personal protective equipment nonadherence detection: computer vision versus human observers.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae262

Mary S Kim, Beomseok Park, Genevieve J Sippel, Aaron H Mun, Wanzhao Yang, Kathleen H McCarthy, Emely Fernandez, Marius George Linguraru, Aleksandra Sarcevic, Ivan Marsic, Randall S Burd

Objectives: Human monitoring of personal protective equipment (PPE) adherence among healthcare providers has several limitations, including the need for additional personnel during staff shortages and decreased vigilance during prolonged tasks. To address these challenges, we developed an automated computer vision system for monitoring PPE adherence in healthcare settings. We assessed the system performance against human observers detecting nonadherence in a video surveillance experiment.

Materials and methods: The automated system was trained to detect 15 classes of eyewear, masks, gloves, and gowns using an object detector and tracker. To assess how the system performs compared to human observers in detecting nonadherence, we designed a video surveillance experiment under 2 conditions: variations in video durations (20, 40, and 60 seconds) and the number of individuals in the videos (3 versus 6). Twelve nurses participated as human observers. Performance was assessed based on the number of detections of nonadherence.

Results: Human observers detected fewer instances of nonadherence than the system (parameter estimate -0.3, 95% CI -0.4 to -0.2, P < .001). Human observers detected more nonadherence during longer video durations (parameter estimate 0.7, 95% CI 0.4-1.0, P < .001). The system achieved a sensitivity of 0.86, a specificity of 1, and a Matthews correlation coefficient of 0.82 for detecting PPE nonadherence.
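
The three metrics reported above can all be derived from a 2x2 confusion matrix. The following is a minimal sketch of those definitions, not the authors' evaluation code; the function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute sensitivity, specificity, and the Matthews correlation
    coefficient (MCC) from confusion-matrix counts."""
    # Sensitivity: share of true nonadherence events that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of adherent cases correctly left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # MCC uses all four cells, so it stays informative under class imbalance.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Hypothetical counts for illustration only; these are NOT the study's data.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
print(metrics)
```

Because MCC depends on all four cells, two systems with the same sensitivity and specificity can still differ in MCC when the proportion of nonadherent cases differs.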

Discussion: An automated system simultaneously tracks multiple objects and individuals. The system performance is also independent of observation duration, an improvement over human monitoring.

Conclusion: The automated system presents a potential solution for scalable monitoring of hospital-wide infection control practices and improving PPE usage in healthcare settings.

Pages: 163-171 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648733/pdf/

Citations: 0

The journey to building a diverse, equitable, and inclusive American Medical Informatics Association.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae258

Tiffani J Bright, Oliver J Bear Don't Walk IV, Carl Erwin Johnson, Carolyn Petersen, Patricia C Dykes, Krista G Martin, Kevin B Johnson, Lois Walters-Threat, Catherine K Craven, Robert J Lucero, Gretchen P Jackson, Rubina F Rizvi

Objective: The American Medical Informatics Association (AMIA) Task Force on Diversity, Equity, and Inclusion (DEI) was established to address systemic racism and health disparities in biomedical and health informatics, aligning with AMIA's mission to transform healthcare. AMIA's DEI initiatives were spurred by member voices responding to police brutality and COVID-19's impact on Black/African American communities.

Materials and methods: The Task Force, consisting of 20 members across 3 groups aligned with AMIA's 2020-2025 Strategic Plan, met biweekly to develop DEI recommendations with the help of 16 additional volunteers. These recommendations were reviewed, prioritized, and presented to the AMIA Board of Directors for approval.

Results: In 9 months, the Task Force (1) created a logic model to support workforce diversity and raise AMIA's DEI awareness, (2) conducted an environmental scan of other associations' DEI activities, (3) developed a DEI framework for AMIA meetings, (4) gathered member feedback, (5) cultivated DEI educational resources, (6) created a Board nominations and diversity session, (7) reviewed the Board's Strategic Planning for DEI alignment, (8) led a program to increase diversity at the 2020 AMIA Virtual Annual Symposium, and (9) standardized socially-assigned race and ethnicity data collection.

Discussion: The Task Force proposed actionable recommendations that focused on AMIA's role in addressing systemic racism and health equity, helping the organization understand its member diversity.

Conclusion: This work supported marginalized groups, broadened the research agenda, and positioned AMIA as a DEI leader while reinforcing the need for ongoing transformation within informatics.

Pages: 3-8 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648708/pdf/

Citations: 0

The role of routine and structured social needs data collection in improving care in US hospitals.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae279

Chelsea Richwine, Vaishali Patel, Jordan Everson, Bradley Iott

Objectives: To understand how health-related social needs (HRSN) data are collected at US hospitals and implications for use.

Materials and methods: Using 2023 nationally representative survey data on US hospitals (N = 2775), we described hospitals' routine and structured collection and use of HRSN data and examined the relationship between methods of data collection and specific uses. Multivariate logistic regression was used to identify characteristics associated with data collection and use and understand how methods of data collection relate to use.

Results: In 2023, 88% of hospitals collected HRSN data (64% routinely, 72% structured). While hospitals commonly used data for internal purposes (eg, discharge planning, 79%), those that collected data routinely and in a structured format (58%) used data for purposes involving coordination or exchange with other organizations (eg, making referrals, 74%) at higher rates than hospitals that collected data but not routinely or in a non-structured format (eg, 93% vs 67% for referrals, P < .05). In multivariate regression, routine and structured data collection was positively associated with all uses of data examined. Hospital location, ownership, system-affiliation, value-based care participation, and critical access designation were associated with HRSN data collection, but only system-affiliation was consistently (positively) associated with use.
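
To give a sense of scale for the 93% vs 67% referral comparison above, the two reported proportions can be converted into a crude odds ratio. This is only a back-of-the-envelope sketch: it is unadjusted, it is not the multivariate logistic regression the authors ran, and the helper names `odds` and `odds_ratio` are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a proportion strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Unadjusted odds ratio comparing group A with group B."""
    return odds(p_a) / odds(p_b)


# Proportions taken from the abstract: 93% of hospitals with routine and
# structured collection used HRSN data for referrals, vs 67% of the others.
referral_or = odds_ratio(0.93, 0.67)
print(round(referral_or, 2))  # about 6.54
```

An adjusted estimate from the regression would generally differ, since it accounts for hospital characteristics such as ownership and system affiliation.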

Discussion: While most hospitals screen for social needs, fewer collect data routinely and in a structured format that would facilitate downstream use. Routine and structured data collection was associated with greater use, particularly for secondary purposes.

Conclusion: Routine and structured screening may result in more actionable data that facilitates use for various purposes that support patient care and improve community and population health, indicating the importance of continuing efforts to increase routine screening and standardize HRSN data collection.

Pages: 28-37 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648711/pdf/

Citations: 0

Correction to: Artificial intelligence for optimizing recruitment and retention in clinical trials: a scoping review.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae283

Citations: 0

Is ChatGPT worthy enough for provisioning clinical decision support?

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae282

Partha Pratim Ray

Citations: 0

Machine learning-based infection diagnostic and prognostic models in post-acute care settings: a systematic review.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2025-01-01 DOI: 10.1093/jamia/ocae278

Zidu Xu, Danielle Scharp, Mollie Hobensack, Jiancheng Ye, Jungang Zou, Sirui Ding, Jingjing Shang, Maxim Topaz

Objectives: This study aims to (1) review machine learning (ML)-based models for early infection diagnosis and prognosis prediction in post-acute care (PAC) settings, (2) identify key risk predictors influencing infection-related outcomes, and (3) examine the quality and limitations of these models.

Materials and methods: PubMed, Web of Science, Scopus, IEEE Xplore, CINAHL, and ACM digital library were searched in February 2024. Eligible studies leveraged PAC data to develop and evaluate ML models for infection-related risks. Data extraction followed the CHARMS checklist. Quality appraisal followed the PROBAST tool. Data synthesis was guided by the socio-ecological conceptual framework.

Results: Thirteen studies were included, mainly focusing on respiratory infections and nursing homes. Most used regression models with structured electronic health record data. Since 2020, there has been a shift toward advanced ML algorithms and multimodal data, with biosensors and clinical notes being significant sources of unstructured data. Despite these advances, there is insufficient evidence to support performance improvements over traditional models. Individual-level risk predictors, like impaired cognition, declined function, and tachycardia, were commonly used, while contextual-level predictors were barely utilized, consequently limiting model fairness. Major sources of bias included lack of external validation, inadequate model calibration, and insufficient consideration of data complexity.

Discussion and conclusion: Despite the growth of advanced modeling approaches in infection-related models in PAC settings, evidence supporting their superiority remains limited. Future research should leverage a socio-ecological lens for predictor selection and model construction, exploring optimal data modalities and ML model usage in PAC, while ensuring rigorous methodologies and fairness considerations.

Pages: 241-252 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648729/pdf/

Citations: 0

Using large language models to detect outcomes in qualitative studies of adolescent depression.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2024-12-11 DOI: 10.1093/jamia/ocae298

Alison W Xin, Dylan M Nielson, Karolin Rose Krause, Guilherme Fiorini, Nick Midgley, Francisco Pereira, Juan Antonio Lossio-Ventura

Objective: We aim to use large language models (LLMs) to detect mentions of more nuanced psychotherapeutic outcomes and impacts than previously considered in transcripts of interviews with adolescents with depression. Our clinical authors previously created a novel coding framework containing fine-grained therapy outcomes beyond the binary classification (eg, depression vs control) based on qualitative analysis embedded within a clinical study of depression. Moreover, we seek to demonstrate that embeddings from LLMs are informative enough to accurately label these experiences.

Materials and methods: Data were drawn from interviews, where text segments were annotated with different outcome labels. Five different open-source LLMs were evaluated to classify outcomes from the coding framework. Classification experiments were carried out on the original interview transcripts. Furthermore, we repeated those experiments for versions of the data produced by breaking those segments into conversation turns, or keeping only non-interviewer utterances (monologues).
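
The two alternative segmentations described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing: it assumes a segment is a list of (speaker, text) pairs, that a "turn" is a run of consecutive utterances by one speaker, and that the interviewer is identified by the label "Interviewer". The function names and the sample text are hypothetical.

```python
from typing import List, Tuple

Utterance = Tuple[str, str]  # (speaker label, utterance text)


def to_turns(segment: List[Utterance]) -> List[Utterance]:
    """Merge consecutive utterances by the same speaker into one turn."""
    turns: List[Utterance] = []
    for speaker, text in segment:
        if turns and turns[-1][0] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1] = (speaker, turns[-1][1] + " " + text)
        else:
            turns.append((speaker, text))
    return turns


def to_monologue(segment: List[Utterance], interviewer: str = "Interviewer") -> str:
    """Keep only non-interviewer utterances, joined into a single text."""
    return " ".join(text for speaker, text in segment if speaker != interviewer)


# Invented example segment, for illustration only.
segment = [
    ("Interviewer", "How have things been at school?"),
    ("Participant", "Better than before."),
    ("Participant", "I see my friends more."),
    ("Interviewer", "What changed?"),
    ("Participant", "I feel less tired."),
]
turns = to_turns(segment)
monologue = to_monologue(segment)
print(len(turns), monologue)
```

Each resulting turn or monologue would then inherit the outcome labels of the segment it came from, if one follows the labeling scheme the abstract implies.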

Results: We used classification models to predict 31 outcomes and 8 derived labels, for 3 different text segmentations. Area under the ROC curve scores ranged between 0.6 and 0.9 for the original segmentation and 0.7 and 1.0 for the monologues and turns.
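
The area under the ROC curve reported above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition; it is a plain-Python illustration rather than the authors' evaluation pipeline, and the function name `roc_auc` and the sample labels and scores are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented labels and classifier scores, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(auc)  # 7 of 9 pairs are ranked correctly
```

With 31 outcomes and 8 derived labels, one such score would be computed per label, which is consistent with the abstract reporting a range rather than a single value.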

Discussion: LLM-based classification models could identify outcomes important to adolescents, such as friendships or academic and vocational functioning, in text transcripts of patient interviews. By using clinical data, we also aim to better generalize to clinical settings compared to studies based on public social media data.

Conclusion: Our results demonstrate that fine-grained therapy outcome coding in psychotherapeutic text is feasible, and can be used to support the quantification of important outcomes for downstream uses.

Citations: 0

Empowering the biomedical research community: Innovative SAS deployment on the All of Us Researcher Workbench.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2024-12-01 DOI: 10.1093/jamia/ocae216

Izabelle Humes, Cathy Shyr, Moira Dillon, Zhongjie Liu, Jennifer Peterson, Chris St Jeor, Jacqueline Malkes, Hiral Master, Brandy Mapes, Romuladus Azuine, Nakia Mack, Bassent Abdelbary, Joyonna Gamble-George, Emily Goldmann, Stephanie Cook, Fatemeh Choupani, Rubin Baskir, Sydney McMaster, Chris Lunt, Karriem Watson, Minnkyong Lee, Sophie Schwartz, Ruchi Munshi, David Glazer, Eric Banks, Anthony Philippakis, Melissa Basford, Dan Roden, Paul A Harris

Objectives: The All of Us Research Program is a precision medicine initiative aimed at establishing a vast, diverse biomedical database accessible through a cloud-based data analysis platform, the Researcher Workbench (RW). Our goal was to empower the research community by co-designing the implementation of SAS in the RW alongside researchers to enable broader use of All of Us data.

Materials and methods: Researchers from various fields and with different SAS experience levels participated in co-designing the SAS implementation through user experience interviews.

Results: Feedback and lessons learned from user testing informed the final design of the SAS application.

Discussion: The co-design approach is critical for reducing technical barriers, broadening All of Us data use, and enhancing the user experience for data analysis on the RW.

Conclusion: Our co-design approach successfully tailored the implementation of the SAS application to researchers' needs. This approach may inform future software implementations on the RW.

Pages: 2994-3000 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631098/pdf/

Citations: 0

Multi-modality risk prediction of cardiovascular diseases for breast cancer cohort in the All of Us Research Program.

IF 4.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association

Pub Date : 2024-12-01 DOI: 10.1093/jamia/ocae199

Han Yang, Sicheng Zhou, Zexi Rao, Chen Zhao, Erjia Cui, Chetan Shenoy, Anne H Blaes, Nishitha Paidimukkala, Jinhua Wang, Jue Hou, Rui Zhang

Objective: This study leverages the rich diversity of the All of Us Research Program (All of Us) dataset to devise a predictive model for cardiovascular disease (CVD) in breast cancer (BC) survivors. Central to this endeavor is the creation of a robust data integration pipeline that synthesizes electronic health records (EHRs), patient surveys, and genomic data, while upholding fairness across demographic variables.

Materials and methods: We have developed a universal data wrangling pipeline to process and merge heterogeneous data sources of the All of Us dataset, address missingness and variance in data, and align disparate data modalities into a coherent framework for analysis. Utilizing a composite feature set including EHR, lifestyle, and social determinants of health (SDoH) data, we then employed Adaptive Lasso and Random Forest regression models to predict 6 CVD outcomes. The models were evaluated using the c-index and time-dependent Area Under the Receiver Operating Characteristic Curve over a 10-year period.
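The c-index mentioned above can be illustrated with a minimal sketch of Harrell's concordance index for right-censored data. This is not the authors' code: the function name `harrell_c_index`, the simplified handling of tied times, and the toy data are assumptions made only for illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time per subject
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified version
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event was given the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.7, 0.8, 0.1]
print(harrell_c_index(times, events, scores))  # 4 of 5 comparable pairs agree: 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk. The time-dependent AUC used in the study is a related but distinct measure and is not shown here.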

Results: The Adaptive Lasso model showed consistent performance across most CVD outcomes, while the Random Forest model excelled particularly in predicting outcomes like transient ischemic attack when incorporating the full multi-modal feature set. Feature importance analysis revealed age and previous coronary events as dominant predictors across CVD outcomes, with SDoH clustering labels highlighting the nuanced impact of social factors.

Discussion: The development of both a Cox-based predictive model and a Random Forest regression model demonstrates the broad applicability of All of Us data in integrating EHRs and patient surveys to enhance precision medicine. The inclusion of SDoH clustering labels revealed the significant impact of sociobehavioral factors on patient outcomes, emphasizing the importance of comprehensive health determinants in predictive models. Despite these advancements, limitations include the exclusion of genetic data, broad categorization of CVD conditions, and the need for fairness analyses to ensure equitable model performance across diverse populations. Future work should refine clinical and social variable measurements, incorporate advanced imputation techniques, and explore additional predictive algorithms to enhance model precision and fairness.

Conclusion: This study demonstrates the utility of the diverse All of Us dataset for developing a multi-modality predictive model for CVD risk stratification in BC survivors. The data integration pipeline and subsequent predictive models establish a methodological foundation for future research into personalized healthcare.


Pages: 2800-2810 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631116/pdf/

Citations: 0