
Journal of the American Medical Informatics Association: latest articles

Clinician, patient, and organizational perspectives on ambient AI scribes.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf231
Suzanne Bakken
Volume 33, issue 2, pages 253-255. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844586/pdf/
Citations: 0
A scoping review of models to identify transgender patients in electronic health records.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf185
Robert A Becker, Jhansi U L Kolli, Colin G Walsh

Objective: Electronic health records (EHRs) lack a widely adopted standard for recording transgender and gender diverse (TGD) status, complicating research on TGD health. Computational models have been developed to identify TGD individuals in EHRs; however, gaps remain in understanding which components contribute to stronger phenotyping approaches. This scoping review evaluates EHR-based models for identifying TGD individuals, focusing on identifier types, performance, external validation, and ethical reporting to guide best practices.

Materials and methods: We searched PubMed, CINAHL, Web of Science, and Embase for peer-reviewed articles published before January 2024, following PRISMA-ScR guidelines. Included studies used EHR data to identify TGD individuals, verified TGD status, reported or allowed calculation of positive predictive value (PPV), and listed identifiers. Two authors screened and extracted data. We categorized models by data type and logic (structured, unstructured, and multimodal), summarized PPV distributions, and synthesized author-reported ethical considerations.

Results: Fourteen studies describing 50 models met inclusion criteria. Models using TGD-related diagnostic codes alone (n = 11) or requiring both structured and unstructured data (n = 6) showed the highest mean PPVs (85.3% and 97.1%). Models validated on larger confirmed TGD cohorts reported more stable performance, but external validation was rare. Most studies minimally addressed ethics; only 3 described protective measures or stakeholder engagement.
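As a minimal sketch of the two ideas in these results, the Python below computes positive predictive value (confirmed cases among flagged records) for a model that requires a hit in both structured and unstructured data. The patient identifiers, the hit sets, and the function names are hypothetical illustrations and are not drawn from any of the reviewed studies.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = confirmed cases among flagged records / all flagged records."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no records are flagged")
    return true_positives / flagged


def multimodal_intersection(structured_hits: set, unstructured_hits: set) -> set:
    """Flag a patient only when both data types agree (intersection logic)."""
    return structured_hits & unstructured_hits


# Hypothetical toy data: patient IDs flagged by each data type.
structured = {"p1", "p2", "p3", "p4"}      # e.g. diagnostic-code hits
unstructured = {"p2", "p3", "p4", "p5"}    # e.g. clinical-note keyword hits
confirmed = {"p2", "p3", "p5"}             # status verified by chart review

flagged = multimodal_intersection(structured, unstructured)
tp = len(flagged & confirmed)
fp = len(flagged - confirmed)
ppv = positive_predictive_value(tp, fp)
print(f"flagged={sorted(flagged)} PPV={ppv:.3f}")
```

Requiring agreement between data types shrinks the flagged set, which tends to raise PPV at the cost of missing patients who appear in only one source (p5 in this toy data).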

Discussion: Phenotyping of TGD individuals in EHR data remains heterogeneous in design and ethical transparency. Reported PPVs should be interpreted cautiously, as performance is influenced by study design, sample size, and verification methods.

Conclusions: Our recommendations emphasize the components that strengthen phenotyping approaches (identifier choice, multimodal intersection logic, validation practices, and ethical safeguards) rather than endorsing any single model.

Pages 472-483. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844585/pdf/
Citations: 0
Derivation and validation of an algorithm for maternal-child linkage in electronic health records.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf177
Colin M Rogerson, Christopher W Bartlett, John Price, Lang Li, Eneida A Mendonca, Shaun Grannis

Introduction: We created a probabilistic maternal-child electronic health record (EHR) linkage algorithm to promote clinical research in maternal-child health.

Methods: We used EHR data from 1994 to 2024 to create an XGBoost model to predict maternal-child linkages. The model used standard EHR elements as predictor variables (first name, last name, birthdate, address, phone number, and email) and an EHR-embedded maternal-child indicator as the deterministic outcome.
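The general shape of a probabilistic linkage pipeline (blocking to limit candidate pairs, then pairwise comparison features for a classifier) can be sketched as follows. The abstract does not state the blocking criteria or the feature encoding, so the last-name blocking key, the agreement features, and all record values here are assumptions made purely for illustration. The resulting feature rows are the kind of input a gradient-boosted classifier would be trained on, and the training step is not shown.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only.
mothers = [
    {"id": "m1", "last_name": "Lopez", "address": "12 Oak St", "phone": "555-0101"},
    {"id": "m2", "last_name": "Chen", "address": "9 Elm Ave", "phone": "555-0102"},
]
children = [
    {"id": "c1", "last_name": "Lopez", "address": "12 Oak St", "phone": "555-0101"},
    {"id": "c2", "last_name": "Chen", "address": "40 Pine Rd", "phone": "555-0102"},
    {"id": "c3", "last_name": "Smith", "address": "7 Birch Ln", "phone": "555-0199"},
]


def blocking_key(record: dict) -> str:
    """Assumed blocking rule: records are compared only if last names match."""
    return record["last_name"].strip().lower()


def candidate_pairs(mothers: list, children: list):
    """Yield (mother, child) pairs that share a blocking key."""
    blocks = defaultdict(list)
    for child in children:
        blocks[blocking_key(child)].append(child)
    for mother in mothers:
        for child in blocks.get(blocking_key(mother), []):
            yield mother, child


def pair_features(mother: dict, child: dict) -> dict:
    """Binary agreement features for one candidate pair."""
    return {
        "same_address": int(mother["address"] == child["address"]),
        "same_phone": int(mother["phone"] == child["phone"]),
    }


features = {
    (m["id"], c["id"]): pair_features(m, c)
    for m, c in candidate_pairs(mothers, children)
}
for pair, row in features.items():
    print(pair, row)
```

Blocking is what keeps the comparison space tractable: without it, every record would be compared with every other record.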

Results: From 82 million unique records, 6.2 billion potential pairs met blocking criteria. Of the potential pairs, 33 364 674 contained the deterministic indicator and were used as cases, and an equal number of controls were randomly sampled. The final model obtained an accuracy of 92%, a precision of 98%, a recall of 87%, and an F1-score of 92%.
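The reported metrics can be cross-checked with a few lines of arithmetic. The sketch below recomputes the F1-score from the stated precision and recall, and derives the accuracy implied by an equal number of cases and controls. The inputs are the rounded values from the abstract, so the derived figures are approximate, and the variable names are illustrative.

```python
# Reported (rounded) values from the abstract.
precision = 0.98
recall = 0.87
n_cases = 33_364_674          # pairs with the deterministic indicator
n_controls = n_cases          # "an equal number of controls"

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Reconstruct an approximate confusion matrix from precision and recall.
true_pos = recall * n_cases
false_pos = true_pos * (1 - precision) / precision
true_neg = n_controls - false_pos
accuracy = (true_pos + true_neg) / (n_cases + n_controls)

print(f"F1 ~ {f1:.3f}, implied accuracy ~ {accuracy:.3f}")
```

With these rounded inputs the F1-score comes out at about 0.92, matching the reported value, and the implied accuracy falls between 0.92 and 0.93, which is compatible with the reported 92% once rounding of precision and recall is taken into account.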

Conclusion: We derived and validated a probabilistic maternal-child linkage algorithm using routinely collected EHR data elements that could benefit future observational research in maternal-child health.

Pages 451-456. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844568/pdf/
Citations: 0
Observer: creation of a novel multimodal dataset for outpatient care research.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf182
Kevin B Johnson, Basam Alasaly, Kuk Jin Jang, Eric Eaton, Sriharsha Mopidevi, Ross Koppel

Objective: To support ambulatory care innovation, we created Observer, a multimodal dataset comprising videotaped outpatient visits, electronic health record (EHR) data, and structured surveys. This paper describes the data collection procedures and summarizes the clinical and contextual features of the dataset.

Materials and methods: A multistakeholder steering group shaped recruitment strategies, survey design, and privacy-preserving design. Consented patients and primary care providers (PCPs) were recorded using room-view and egocentric cameras. EHR data, metadata, and audit logs were also captured. A custom de-identification pipeline, combining transcript redaction, voice masking, and facial blurring, ensured video and EHR HIPAA compliance.

Results: We report on the first 100 visits in this continually growing dataset. Thirteen PCPs from 4 clinics participated. Recording the first 100 visits required approaching 210 patients, of whom 129 consented (61%); 29 of those patients missed their scheduled encounter after consenting. Visit lengths ranged from 5 to 100 minutes, covering preventive care to chronic disease management. Survey responses revealed high satisfaction: 4.24/5 (patients) and 3.94/5 (PCPs). Visit experience was unaffected by the presence of video recording technology.

Discussion: We demonstrate the feasibility of capturing rich, real-world primary care interactions using scalable, privacy-sensitive methods. Room layout and camera placement were key influences on recorded communication and are now added to the dataset. The Observer dataset enables future clinical AI research/development, communication studies, and informatics education among public and private user groups.

Conclusion: Observer is a new, shareable, real-world clinic encounter research and teaching resource with a representative sample of adult primary care data.

Pages 424-433. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844583/pdf/
Citations: 0
Enterprise-wide simultaneous deployment of ambient scribe technology: lessons learned from an academic health system.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf186
Aileen P Wright, Carolynn K Nall, Jacob J H Franklin, Sara N Horst, Yaa A Kumah-Crystal, Adam T Wright, Dara E Mize

Objectives: To report on the feasibility of a simultaneous, enterprise-wide deployment of EHR-integrated ambient scribe technology across a large academic health system.

Materials and methods: On January 15, 2025, ambient scribing was made available to over 2400 ambulatory and emergency department clinicians. We tracked utilization rates, technical support needs, and user feedback.

Results: By March 31, 2025, 20.1% of visit notes incorporated ambient scribing, and 1223 clinicians had used ambient scribing. Among 209 respondents (22.1% of 947 surveyed), 90.9% would be disappointed if they lost access to ambient scribing, and 84.7% reported a positive training experience.

Discussion: Enterprise-wide simultaneous deployment combined with a low-barrier training model enabled immediate access for clinicians and reduced administrative burden by concentrating go-live efforts. Support needs were manageable.

Conclusion: Simultaneous enterprise-wide deployment of ambient scribing was feasible and provided immediate access for clinicians.

Pages 457-461. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844588/pdf/
Citations: 0
Transfer-learning on federated observational healthcare data for prediction models using Bayesian sparse logistic regression with informed priors.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf146
Kelly Mohe Li, Jenna Marie Reps, Akihiko Nishimura, Martijn J Schuemie, Marc A Suchard

Objective: To develop a transfer-learning Bayesian sparse logistic regression model that transfers information learned from one dataset to another by using an informed prior to facilitate model fitting in small-sample clinical patient-level prediction problems that suffer from a lack of available information.

Methods: We propose a Bayesian framework for prediction using logistic regression that aims to conduct transfer-learning on regression coefficient information from a larger dataset model (on the order of 10^5-10^6 patients by 10^5 features) into a small-sample model (on the order of 10^3 patients). Our approach imposes an informed, hierarchical prior on each regression coefficient defined as a discrete mixture of the Bayesian Bridge shrinkage prior and an informed normal distribution. Performance of the informed model is compared against traditional methods, primarily measured by area under the curve, calibration, bias, and sparsity using both simulations and a real-world problem.
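To make the "discrete mixture" prior concrete, the sketch below evaluates a two-component density for a single regression coefficient: a Bayesian bridge component, proportional to exp(-|beta/scale|^exponent), and a normal component centered on the coefficient learned from the larger dataset. The abstract does not give the mixing weight, the hyperparameters, or the hierarchical structure, so every parameter name and numeric value here is an assumption chosen for illustration and should not be read as the authors' specification.

```python
import math


def bridge_density(beta: float, scale: float, exponent: float) -> float:
    """Normalized density proportional to exp(-|beta / scale| ** exponent)."""
    norm = exponent / (2.0 * scale * math.gamma(1.0 / exponent))
    return norm * math.exp(-(abs(beta / scale) ** exponent))


def normal_density(beta: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and sd."""
    z = (beta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_prior_density(beta, informed_mean, informed_sd,
                          mix_weight, scale, exponent):
    """Two-component mixture: informed normal with weight mix_weight,
    bridge shrinkage component with weight 1 - mix_weight."""
    informed = normal_density(beta, informed_mean, informed_sd)
    shrinkage = bridge_density(beta, scale, exponent)
    return mix_weight * informed + (1.0 - mix_weight) * shrinkage


# Hypothetical setting: the large-data model estimated this coefficient at 0.8.
for beta in (0.0, 0.8):
    density = mixture_prior_density(beta, informed_mean=0.8, informed_sd=0.1,
                                    mix_weight=0.5, scale=0.2, exponent=0.5)
    print(f"prior density at beta={beta}: {density:.3f}")
```

The mixture keeps mass near zero (sparsity from the bridge component) and near the transferred estimate (information from the larger dataset), so the small-sample data decide between the two. Only coefficient-level summaries enter the prior, which is why no patient-level data need to be shared.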

Results: Across all experiments, transfer-learning outperformed the traditional L1-regularized model across discrimination, calibration, bias, and sparsity. In fact, even using only a continuous shrinkage prior without the informed prior increased model performance when compared to L1-regularization.

Conclusion: Transfer-learning using informed priors can help fine-tune prediction models in small datasets suffering from a lack of information. One major benefit is that the prior does not depend on patient-level information, so transfer-learning can be conducted without violating privacy. In future work, the model can be applied to learning between disparate databases, or to similar lack-of-information cases such as rare outcome prediction.

Pages 409-423. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844582/pdf/
Citations: 0
Re-identification risk for common privacy preserving patient matching strategies when shared with de-identified demographics.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf183
Austin Eliazar, James Thomas Brown, Sara Cinamon, Murat Kantarcioglu, Bradley Malin

Objective: Privacy preserving record linkage (PPRL) refers to techniques used to identify which records refer to the same person across disparate datasets while safeguarding their identities. PPRL is increasingly relied upon to facilitate biomedical research. A common strategy encodes personally identifying information for comparison without disclosing underlying identifiers. As the scale of research datasets expands, it becomes crucial to reassess the privacy risks associated with these encodings. This paper highlights the potential re-identification risks of some of these encodings, demonstrating an attack that exploits encoding repetition across patients.

Materials and methods: The attack leverages repeated PPRL encoding values combined with common demographics shared during PPRL in the clear (e.g., 3-digit ZIP code) to distinguish encodings from one another and ultimately link them to identities in a reference dataset. Using US Census statistics and voter registries, we empirically estimate encodings' re-identification risk against such an attack, while varying multiple factors that influence the risk.

Results: Re-identification risk for PPRL encodings increases with population size, number of distinct encodings per patient, and amount of demographic information available. Commonly used encodings typically grow from <1% re-identification rate for datasets under one million individuals to 10%-20% for 250 million individuals.
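One way to picture how repeated encodings plus clear-text demographics can leak identity is a simple frequency-matching toy example. The sketch below is not the attack evaluated in the paper: the encoding (a salted hash of a first name), the matching rule (unique occurrence counts within a 3-digit ZIP group), and the data are all assumptions made for illustration, and the attacker is unrealistically given a reference list identical to the shared population.

```python
import hashlib
from collections import Counter, defaultdict


def encode(value: str, salt: str = "hypothetical-salt") -> str:
    """Deterministic salted hash standing in for a PPRL encoding (assumption)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def frequency_match(shared: list, reference: list) -> dict:
    """Map (encoding, zip3) to a name when both have the same occurrence
    count and that count is unique within the ZIP3 group."""
    tokens_by_group = defaultdict(list)   # (zip3, count) -> encodings
    for (token, zip3), count in Counter(shared).items():
        tokens_by_group[(zip3, count)].append(token)

    names_by_group = defaultdict(list)    # (zip3, count) -> names
    for (name, zip3), count in Counter(reference).items():
        names_by_group[(zip3, count)].append(name)

    matches = {}
    for group, tokens in tokens_by_group.items():
        names = names_by_group.get(group, [])
        if len(tokens) == 1 and len(names) == 1:
            matches[(tokens[0], group[0])] = names[0]
    return matches


# Hypothetical reference population: (first name, 3-digit ZIP).
reference = [
    ("Ana", "303"), ("Ana", "303"), ("Ana", "303"),
    ("Bo", "303"), ("Bo", "303"), ("Cy", "303"),
    ("Dee", "606"), ("Eli", "606"),
]
# What the data recipient sees: encodings plus clear-text ZIP3.
shared = [(encode(name), zip3) for name, zip3 in reference]

matches = frequency_match(shared, reference)
reidentified = sum(1 for record in shared if record in matches)
rate = reidentified / len(shared)
print(f"{reidentified} of {len(shared)} records matched ({rate:.0%})")
```

In this toy data the encodings in ZIP3 303 have distinct counts and are matched, while the two encodings in ZIP3 606 tie and stay ambiguous. The hash itself is never inverted; the leak comes entirely from repetition patterns combined with the demographic field.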

Discussion and conclusion: Re-identification risk often remains low in smaller populations, but increases significantly at the larger scales increasingly encountered today. These risks are common in many PPRL implementations, although, as our work shows, they are avoidable. Choosing better tokens or matching tokens through a third party without the underlying demographics effectively eliminates these risks.

Pages 336-346. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844594/pdf/
Citations: 0
AcuKG: a comprehensive knowledge graph for medical acupuncture.
IF 4.6, Zone 2 (Medicine), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS, Pub Date: 2026-02-01, DOI: 10.1093/jamia/ocaf179
Yiming Li, Xueqing Peng, Suyuan Peng, Jianfu Li, Donghong Pei, Qin Zhang, Yiwei Lu, Yan Hu, Fang Li, Li Zhou, Yongqun He, Cui Tao, Hua Xu, Na Hong

Background: Acupuncture, a key modality in traditional Chinese medicine, is gaining global recognition as a complementary therapy and a subject of increasing scientific interest. However, fragmented and unstructured acupuncture knowledge spread across diverse sources poses challenges for semantic retrieval, reasoning, and in-depth analysis. To address this gap, we developed AcuKG, a comprehensive knowledge graph that systematically organizes acupuncture-related knowledge to support sharing, discovery, and artificial intelligence-driven innovation in the field.

Methods: AcuKG integrates data from multiple sources, including online resources, guidelines, PubMed literature, ClinicalTrials.gov, and multiple ontologies (SNOMED CT, UBERON, and MeSH). We employed entity recognition, relation extraction, and ontology mapping to establish AcuKG, with human-in-the-loop to ensure data quality. Two cases evaluated AcuKG's usability: (1) how AcuKG advances acupuncture research for obesity and (2) how AcuKG enhances large language model (LLM) application on acupuncture question-answering.

Results: AcuKG comprises 1839 entities and 11 527 relations, mapped to 1836 standard concepts in 3 ontologies. Two use cases demonstrated AcuKG's effectiveness and potential in advancing acupuncture research and supporting LLM applications. In the obesity use case, AcuKG identified highly relevant acupoints (eg, ST25, ST36) and uncovered novel research insights based on evidence from clinical trials and literature. When applied to LLMs in answering acupuncture-related questions, integrating AcuKG with GPT-4o and LLaMA 3 significantly improved accuracy (GPT-4o: 46% → 54%, P = .03; LLaMA 3: 17% → 28%, P = .01).

Conclusion: AcuKG is an open dataset that provides a structured and computational framework for acupuncture applications, bridging traditional practices with acupuncture research and cutting-edge LLM technologies.

Citations: 0
Assessing genetic counseling efficiency with natural language processing.
IF 4.6 Zone 2, Medicine Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-01 DOI: 10.1093/jamia/ocaf190
Michelle H Nguyen, Carolyn D Applegate, Brittney Murray, Ayah Zirikly, Crystal Tichnell, Catherine Gordon, Lisa R Yanek, Cynthia A James, Casey Overby Taylor

Objective: To build natural language processing (NLP) strategies to characterize measures of genetic counseling (GC) efficiency and classify measures according to phase of GC (pre- or post-genetic testing).

Materials and methods: This study selected and annotated 800 GC notes from 7 clinical specialties in a large academic medical center for NLP model development and validation. The NLP approaches extracted GC efficiency measures, including direct and indirect time and GC phase. The models were then applied to 24 102 GC notes collected from January 2016 through December 2023.

Results: NLP approaches performed well (F1 scores of 0.95 and 0.90 for direct time in GC and GC phase classification, respectively). Our findings showed median direct time in GC of 50 minutes, with significant differences in direct time distributions observed across clinical specialties, time periods (2016-2019 or 2020-2023), delivery modes (in person or telehealth), and GC phase.

Discussion: As referrals to GC increase, there is increasing pressure to improve efficiency. Our NLP strategy was used to generate and summarize real-world evidence of GC time for 7 clinical specialties. These approaches enable future research on the impact of interventions intended to improve GC efficiency.

Conclusion: This work demonstrated the practical value of NLP as a useful and scalable strategy for generating real-world evidence of GC efficiency. Principles presented in this work may also be valuable for health services research in other practice areas.

Citations: 0
PhenoFit: a framework for determining computable phenotyping algorithm fitness for purpose and reuse.
IF 4.6 Zone 2, Medicine Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date: 2026-02-01 DOI: 10.1093/jamia/ocaf195
Laura K Wiley, Luke V Rasmussen, Rebecca T Levinson, Jennifer Malinowski, Sheila M Manemann, Melissa P Wilson, Martin Chapman, Jennifer A Pacheco, Theresa L Walunas, Justin B Starren, Suzette J Bielinski, Rachel L Richesson

Background: Computational phenotyping from electronic health records (EHRs) is essential for clinical research, decision support, and quality/population health assessment, but the proliferation of algorithms for the same conditions makes it difficult to identify which algorithm is most appropriate for reuse.

Objective: To develop a framework for assessing phenotyping algorithm fitness for purpose and reuse.

Fitness for purpose: Phenotyping algorithms are fit for purpose when they identify the intended population with performance characteristics appropriate for the intended application.

Fitness for reuse: Phenotyping algorithms are fit for reuse when they are implementable and generalizable; that is, when they identify the same intended population with similar performance characteristics in a new setting.

Conclusions: The PhenoFit framework provides a structured approach to evaluating and adapting phenotyping algorithms for new contexts, increasing the efficiency and consistency of identifying patient populations from EHRs.

Citations: 0