"Clinician, patient, and organizational perspectives on ambient AI scribes." Suzanne Bakken. Journal of the American Medical Informatics Association. 2026;33(2):253-255. doi:10.1093/jamia/ocaf231. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844586/pdf/
Objective: Electronic health records (EHRs) lack a widely adopted standard for recording transgender and gender diverse (TGD) status, complicating research on TGD health. Computational models have been developed to identify TGD individuals in EHRs; however, gaps remain in understanding which components contribute to stronger phenotyping approaches. This scoping review evaluates EHR-based models for identifying TGD individuals, focusing on identifier types, performance, external validation, and ethical reporting to guide best practices.
Materials and methods: We searched PubMed, CINAHL, Web of Science, and Embase for peer-reviewed articles published before January 2024, following PRISMA-ScR guidelines. Included studies used EHR data to identify TGD individuals, verified TGD status, reported or allowed calculation of positive predictive value (PPV), and listed identifiers. Two authors screened and extracted data. We categorized models by data type and logic (structured, unstructured, and multimodal), summarized PPV distributions, and synthesized author-reported ethical considerations.
Results: Fourteen studies describing 50 models met inclusion criteria. Models using TGD-related diagnostic codes alone (n = 11) or requiring both structured and unstructured data (n = 6) showed the highest mean PPVs (85.3% and 97.1%, respectively). Models validated on larger confirmed TGD cohorts reported more stable performance, but external validation was rare. Most studies minimally addressed ethics; only 3 described protective measures or stakeholder engagement.
Discussion: Phenotyping of TGD individuals in EHR data remains heterogeneous in design and ethical transparency. Reported PPVs should be interpreted cautiously, as performance is influenced by study design, sample size, and verification methods.
Conclusions: Our recommendations emphasize the components that strengthen phenotyping approaches (identifier choice, multimodal intersection logic, validation practices, and ethical safeguards) rather than endorsing any single model.
"A scoping review of models to identify transgender patients in electronic health records." Robert A Becker, Jhansi U L Kolli, Colin G Walsh. Journal of the American Medical Informatics Association. 2026:472-483. doi:10.1093/jamia/ocaf185. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844585/pdf/
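As a concrete illustration of the multimodal intersection logic and PPV computation the review evaluates, the sketch below flags a patient only when structured and unstructured evidence agree. The codes, keywords, and records are invented for illustration and are not drawn from any included study.

```python
# Hypothetical sketch of "multimodal intersection logic": a patient is
# flagged only when a structured TGD-related diagnosis code AND an
# unstructured note keyword both appear. Codes, keywords, and records
# below are illustrative, not from any study in the review.

TGD_CODES = {"F64.0", "F64.9", "Z87.890"}      # example ICD-10-CM codes
TGD_KEYWORDS = ("transgender", "gender diverse", "gender dysphoria")

def flag_multimodal(record):
    """Return True when both structured and unstructured evidence agree."""
    has_code = bool(TGD_CODES & set(record["codes"]))
    note = record["note_text"].lower()
    has_keyword = any(kw in note for kw in TGD_KEYWORDS)
    return has_code and has_keyword

def positive_predictive_value(records):
    """PPV = verified true positives / all flagged patients."""
    flagged = [r for r in records if flag_multimodal(r)]
    if not flagged:
        return None
    true_pos = sum(1 for r in flagged if r["verified_tgd"])
    return true_pos / len(flagged)

records = [
    {"codes": ["F64.0"], "note_text": "Transgender patient, on estradiol.", "verified_tgd": True},
    {"codes": ["F64.9"], "note_text": "Follow-up for asthma.", "verified_tgd": False},   # code only: not flagged
    {"codes": ["J45.909"], "note_text": "Patient is transgender.", "verified_tgd": True},  # note only: not flagged
]

print(positive_predictive_value(records))  # only the first record is flagged -> 1.0
```

Requiring both modalities trades recall for precision, which is consistent with the review's finding that intersection models reported the highest mean PPVs.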
Colin M Rogerson, Christopher W Bartlett, John Price, Lang Li, Eneida A Mendonca, Shaun Grannis
Introduction: We created a probabilistic maternal-child electronic health record (EHR) linkage algorithm to promote clinical research in maternal-child health.
Methods: We used EHR data from 1994 to 2024 to create an XGBoost model to predict maternal-child linkages. The model used standard EHR elements as predictor variables, including first name, last name, birthdate, address, phone number, email, and an EHR-embedded maternal-child indicator as the deterministic outcome.
Results: From 82 million unique records, 6.2 billion potential pairs met blocking criteria. Of the potential pairs, 33 364 674 contained the deterministic indicator and were used as cases, and an equal number of controls were randomly sampled. The final model obtained an accuracy of 92%, a precision of 98%, a recall of 87%, and an F1-score of 92%.
Conclusion: We derived and validated a probabilistic maternal-child linkage algorithm using routinely collected EHR data elements that could benefit future observational research in maternal-child health.
"Derivation and validation of an algorithm for maternal-child linkage in electronic health records." Colin M Rogerson, Christopher W Bartlett, John Price, Lang Li, Eneida A Mendonca, Shaun Grannis. Journal of the American Medical Informatics Association. 2026:451-456. doi:10.1093/jamia/ocaf177. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844568/pdf/
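The blocking and pairwise-feature steps of a linkage pipeline like this one can be sketched as follows. The field names, blocking key, similarity features, and all records are illustrative assumptions; the study's XGBoost classifier, which would score feature vectors of this kind, is not reproduced here.

```python
# Minimal sketch of blocking and pair-feature construction for
# probabilistic record linkage. Fields and features are illustrative;
# a trained classifier (the study used XGBoost) would score each
# feature vector to predict a maternal-child link.
from difflib import SequenceMatcher
from itertools import product

def sim(a, b):
    """Normalized string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def block_key(rec):
    # Block on last-name initial + ZIP so only plausible pairs are compared.
    return (rec["last_name"][:1].upper(), rec["zip"])

def candidate_pairs(mothers, children):
    """Yield (mother, child) pairs that share a blocking key."""
    for m, c in product(mothers, children):
        if block_key(m) == block_key(c):
            yield m, c

def pair_features(m, c):
    """Feature vector a pair classifier would score."""
    return {
        "last_name_sim": sim(m["last_name"], c["last_name"]),
        "address_sim": sim(m["address"], c["address"]),
        "same_phone": float(m["phone"] == c["phone"]),
    }

mothers = [{"last_name": "Garcia", "zip": "46202", "address": "12 Oak St", "phone": "555-0101"}]
children = [
    {"last_name": "Garcia", "zip": "46202", "address": "12 Oak St", "phone": "555-0101"},
    {"last_name": "Smith",  "zip": "90210", "address": "9 Elm Ave", "phone": "555-0199"},
]

pairs = list(candidate_pairs(mothers, children))
print(len(pairs))           # blocking removes the Smith record -> 1
print(pair_features(*pairs[0]))
```

Blocking is what makes the scale tractable: it is the step that reduced 82 million records to 6.2 billion candidate pairs rather than the full cross-product.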
Kevin B Johnson, Basam Alasaly, Kuk Jin Jang, Eric Eaton, Sriharsha Mopidevi, Ross Koppel
Objective: To support ambulatory care innovation, we created Observer, a multimodal dataset comprising videotaped outpatient visits, electronic health record (EHR) data, and structured surveys. This paper describes the data collection procedures and summarizes the clinical and contextual features of the dataset.
Materials and methods: A multistakeholder steering group shaped recruitment strategies, survey design, and privacy-preserving features. Consented patients and primary care providers (PCPs) were recorded using room-view and egocentric cameras. EHR data, metadata, and audit logs were also captured. A custom de-identification pipeline, combining transcript redaction, voice masking, and facial blurring, ensured HIPAA compliance for both video and EHR data.
Results: We report on the first 100 visits in this continually growing dataset. Thirteen PCPs from 4 clinics participated. Recording the first 100 visits required approaching 210 patients, of whom 129 consented (61%); 29 consented patients missed their scheduled encounter. Visit lengths ranged from 5 to 100 minutes, covering preventive care to chronic disease management. Survey responses revealed high satisfaction: 4.24/5 (patients) and 3.94/5 (PCPs). Visit experience was unaffected by the presence of video recording technology.
Discussion: We demonstrate the feasibility of capturing rich, real-world primary care interactions using scalable, privacy-sensitive methods. Room layout and camera placement were key influences on recorded communication and are now added to the dataset. The Observer dataset enables future clinical AI research/development, communication studies, and informatics education among public and private user groups.
Conclusion: Observer is a new, shareable, real-world clinic encounter research and teaching resource with a representative sample of adult primary care data.
"Observer: creation of a novel multimodal dataset for outpatient care research." Kevin B Johnson, Basam Alasaly, Kuk Jin Jang, Eric Eaton, Sriharsha Mopidevi, Ross Koppel. Journal of the American Medical Informatics Association. 2026:424-433. doi:10.1093/jamia/ocaf182. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844583/pdf/
Aileen P Wright, Carolynn K Nall, Jacob J H Franklin, Sara N Horst, Yaa A Kumah-Crystal, Adam T Wright, Dara E Mize
Objectives: To report on the feasibility of a simultaneous, enterprise-wide deployment of EHR-integrated ambient scribe technology across a large academic health system.
Materials and methods: On January 15, 2025, ambient scribing was made available to over 2400 ambulatory and emergency department clinicians. We tracked utilization rates, technical support needs, and user feedback.
Results: By March 31, 2025, 20.1% of visit notes incorporated ambient scribing, and 1223 clinicians had used it. Among 209 respondents (22.1% of 947 surveyed), 90.9% would be disappointed if they lost access to ambient scribing, and 84.7% reported a positive training experience.
Discussion: Enterprise-wide simultaneous deployment combined with a low-barrier training model enabled immediate access for clinicians and reduced administrative burden by concentrating go-live efforts. Support needs were manageable.
Conclusion: Simultaneous enterprise-wide deployment of ambient scribing was feasible and provided immediate access for clinicians.
"Enterprise-wide simultaneous deployment of ambient scribe technology: lessons learned from an academic health system." Aileen P Wright, Carolynn K Nall, Jacob J H Franklin, Sara N Horst, Yaa A Kumah-Crystal, Adam T Wright, Dara E Mize. Journal of the American Medical Informatics Association. 2026:457-461. doi:10.1093/jamia/ocaf186. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844588/pdf/
Kelly Mohe Li, Jenna Marie Reps, Akihiko Nishimura, Martijn J Schuemie, Marc A Suchard
Objective: To develop a transfer-learning Bayesian sparse logistic regression model that transfers information learned from one dataset to another through an informed prior, facilitating model fitting in small-sample clinical patient-level prediction problems where available information is scarce.
Methods: We propose a Bayesian framework for prediction using logistic regression that aims to conduct transfer-learning on regression coefficient information from a larger dataset model (order 10^5-10^6 patients by 10^5 features) into a small-sample model (order 10^3 patients). Our approach imposes an informed, hierarchical prior on each regression coefficient, defined as a discrete mixture of the Bayesian Bridge shrinkage prior and an informed normal distribution. Performance of the informed model is compared against traditional methods, primarily measured by area under the curve, calibration, bias, and sparsity, using both simulations and a real-world problem.
Results: Across all experiments, transfer-learning outperformed the traditional L1-regularized model across discrimination, calibration, bias, and sparsity. In fact, even using only a continuous shrinkage prior without the informed prior increased model performance when compared to L1-regularization.
Conclusion: Transfer-learning using informed priors can help fine-tune prediction models in small datasets suffering from a lack of information. A key benefit is that the prior does not depend on patient-level information, so transfer-learning can be conducted without violating privacy. In future work, the model can be applied to learning between disparate databases or to similar low-information settings such as rare-outcome prediction.
"Transfer-learning on federated observational healthcare data for prediction models using Bayesian sparse logistic regression with informed priors." Kelly Mohe Li, Jenna Marie Reps, Akihiko Nishimura, Martijn J Schuemie, Marc A Suchard. Journal of the American Medical Informatics Association. 2026:409-423. doi:10.1093/jamia/ocaf146. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844582/pdf/
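A reduced sketch of the informed-prior idea, assuming made-up data: logistic regression fit on a small target sample with a normal prior centered on source-model coefficients rather than on zero. The paper's full prior is a discrete mixture of an informed normal with the Bayesian Bridge shrinkage prior; only the informed-normal component is illustrated here.

```python
# MAP logistic regression with a normal prior centered on coefficients
# from a large "source" model instead of zero. This illustrates only
# the informed-normal component of the paper's mixture prior, on
# synthetic data.
import numpy as np

def fit_map_logistic(X, y, prior_mean, prior_precision, n_iter=2000, lr=0.1):
    """MAP estimate: log-likelihood plus log N(beta | prior_mean, 1/prior_precision)."""
    beta = prior_mean.copy()
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p) - prior_precision * (beta - prior_mean)
        beta += lr * grad / len(y)
    return beta

rng = np.random.default_rng(0)
true_beta = np.array([1.5, -2.0])

# Tiny target sample: the "small-sample" regime the paper targets.
X = rng.normal(size=(30, 2))
y = (rng.random(30) < 1 / (1 + np.exp(-X @ true_beta))).astype(float)

source_beta = np.array([1.4, -1.9])   # coefficients from a large source model
informed = fit_map_logistic(X, y, source_beta, prior_precision=10.0)
uninformed = fit_map_logistic(X, y, np.zeros(2), prior_precision=10.0)  # ridge toward zero

print("informed error:", np.linalg.norm(informed - true_beta))
print("uninformed error:", np.linalg.norm(uninformed - true_beta))
```

Because the prior is summarized entirely by `prior_mean` and `prior_precision`, only aggregate coefficient information crosses the database boundary, which is the privacy property the conclusion highlights.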
Austin Eliazar, James Thomas Brown, Sara Cinamon, Murat Kantarcioglu, Bradley Malin
Objective: Privacy preserving record linkage (PPRL) refers to techniques used to identify which records refer to the same person across disparate datasets while safeguarding their identities. PPRL is increasingly relied upon to facilitate biomedical research. A common strategy encodes personally identifying information for comparison without disclosing underlying identifiers. As the scale of research datasets expands, it becomes crucial to reassess the privacy risks associated with these encodings. This paper highlights the potential re-identification risks of some of these encodings, demonstrating an attack that exploits encoding repetition across patients.
Materials and methods: The attack leverages repeated PPRL encoding values combined with common demographics shared during PPRL in the clear (e.g., 3-digit ZIP code) to distinguish encodings from one another and ultimately link them to identities in a reference dataset. Using US Census statistics and voter registries, we empirically estimate encodings' re-identification risk against such an attack, while varying multiple factors that influence the risk.
Results: Re-identification risk for PPRL encodings increases with population size, number of distinct encodings per patient, and amount of demographic information available. For commonly used encodings, the re-identification rate typically grows from <1% for datasets under one million individuals to 10%-20% for 250 million individuals.
Discussion and conclusion: Re-identification risk often remains low in smaller populations, but increases significantly at the larger scales increasingly encountered today. These risks are common in many PPRL implementations, although, as our work shows, they are avoidable. Choosing better tokens or matching tokens through a third party without the underlying demographics effectively eliminates these risks.
"Re-identification risk for common privacy preserving patient matching strategies when shared with de-identified demographics." Austin Eliazar, James Thomas Brown, Sara Cinamon, Murat Kantarcioglu, Bradley Malin. Journal of the American Medical Informatics Association. 2026:336-346. doi:10.1093/jamia/ocaf183. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844594/pdf/
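A simplified stand-in for the linkage setting this attack exploits: deterministic tokens repeat for the same person across datasets, and a demographic shared in the clear (3-digit ZIP) narrows the candidate pool. The sketch assumes the attacker can re-encode registry identities (i.e., the encoding scheme and salt are known), which is a stronger assumption than the repetition-based attack in the paper; all names and data are invented.

```python
# Illustrative (not the paper's) linkage attack: deterministic PPRL
# tokens plus a clear-text 3-digit ZIP let an attacker who knows the
# encoding scheme match records to a reference registry. All data is
# invented.
import hashlib

def encode(identifier, salt="shared-secret"):
    """Deterministic PPRL-style token (a stand-in for real encoding schemes)."""
    return hashlib.sha256((salt + identifier).encode()).hexdigest()[:16]

# Research dataset: tokens plus 3-digit ZIP shared "in the clear".
research = [
    {"token": encode("alice|1990-01-01"), "zip3": "462"},
    {"token": encode("bob|1985-06-15"),   "zip3": "462"},
]

# Attacker's reference dataset (e.g., a voter registry) with identities.
registry = [
    {"name": "Alice", "dob": "1990-01-01", "zip3": "462"},
    {"name": "Bob",   "dob": "1985-06-15", "zip3": "462"},
    {"name": "Carol", "dob": "1972-03-09", "zip3": "900"},
]

def reidentify(research, registry):
    """Re-encode registry identities and match tokens within each ZIP3."""
    matches = {}
    for person in registry:
        guess = encode(f"{person['name'].lower()}|{person['dob']}")
        for row in research:
            if row["zip3"] == person["zip3"] and row["token"] == guess:
                matches[row["token"]] = person["name"]
    return matches

print(reidentify(research, registry))  # both research records linked to identities
```

The mitigation the authors recommend maps directly onto this sketch: if the demographics are not shared alongside the tokens, or matching is brokered by a third party, the `zip3` filter that shrinks the candidate pool disappears.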
Yiming Li, Xueqing Peng, Suyuan Peng, Jianfu Li, Donghong Pei, Qin Zhang, Yiwei Lu, Yan Hu, Fang Li, Li Zhou, Yongqun He, Cui Tao, Hua Xu, Na Hong
Background: Acupuncture, a key modality in traditional Chinese medicine, is gaining global recognition as a complementary therapy and a subject of increasing scientific interest. However, fragmented and unstructured acupuncture knowledge spread across diverse sources poses challenges for semantic retrieval, reasoning, and in-depth analysis. To address this gap, we developed AcuKG, a comprehensive knowledge graph that systematically organizes acupuncture-related knowledge to support sharing, discovery, and artificial intelligence-driven innovation in the field.
Methods: AcuKG integrates data from multiple sources, including online resources, guidelines, PubMed literature, ClinicalTrials.gov, and multiple ontologies (SNOMED CT, UBERON, and MeSH). We employed entity recognition, relation extraction, and ontology mapping to establish AcuKG, with human-in-the-loop review to ensure data quality. Two use cases evaluated AcuKG's usability: (1) how AcuKG advances acupuncture research for obesity and (2) how AcuKG enhances large language model (LLM) applications in acupuncture question-answering.
Results: AcuKG comprises 1839 entities and 11 527 relations, mapped to 1836 standard concepts in 3 ontologies. Two use cases demonstrated AcuKG's effectiveness and potential in advancing acupuncture research and supporting LLM applications. In the obesity use case, AcuKG identified highly relevant acupoints (eg, ST25, ST36) and uncovered novel research insights based on evidence from clinical trials and literature. When applied to LLMs in answering acupuncture-related questions, integrating AcuKG with GPT-4o and LLaMA 3 significantly improved accuracy (GPT-4o: 46% → 54%, P = .03; LLaMA 3: 17% → 28%, P = .01).
Conclusion: AcuKG is an open dataset that provides a structured and computational framework for acupuncture applications, bridging traditional practices with acupuncture research and cutting-edge LLM technologies.
"AcuKG: a comprehensive knowledge graph for medical acupuncture." Yiming Li, Xueqing Peng, Suyuan Peng, Jianfu Li, Donghong Pei, Qin Zhang, Yiwei Lu, Yan Hu, Fang Li, Li Zhou, Yongqun He, Cui Tao, Hua Xu, Na Hong. Journal of the American Medical Informatics Association. 2026:359-370. doi:10.1093/jamia/ocaf179. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844574/pdf/
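A toy triple store illustrating the (subject, relation, object) structure a knowledge graph like AcuKG exposes, and how a partial-pattern query could retrieve context (e.g., acupoints linked to obesity) to ground an LLM prompt. The triples and relation names are illustrative, not actual AcuKG content.

```python
# Toy triple store sketching the structure of a knowledge graph such
# as AcuKG. The triples below are illustrative examples, not data
# exported from AcuKG itself.
triples = [
    ("ST25", "treats", "obesity"),
    ("ST36", "treats", "obesity"),
    ("ST36", "located_on", "Stomach meridian"),
    ("obesity", "mapped_to", "MeSH:D009765"),
]

def query(subject=None, relation=None, obj=None):
    """Return all triples matching the given (possibly partial) pattern."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (relation is None or t[1] == relation)
        and (obj is None or t[2] == obj)
    ]

# Retrieve acupoints linked to obesity, e.g. to ground an LLM prompt.
acupoints = sorted(t[0] for t in query(relation="treats", obj="obesity"))
print(acupoints)  # ['ST25', 'ST36']
```

Retrieving structured triples like these and injecting them into the prompt is the general pattern behind the accuracy gains the authors report when pairing AcuKG with GPT-4o and LLaMA 3.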
{"title":"Assessing genetic counseling efficiency with natural language processing.","authors":"Michelle H Nguyen, Carolyn D Applegate, Brittney Murray, Ayah Zirikly, Crystal Tichnell, Catherine Gordon, Lisa R Yanek, Cynthia A James, Casey Overby Taylor","doi":"10.1093/jamia/ocaf190","DOIUrl":"10.1093/jamia/ocaf190","url":null,"abstract":"<p><strong>Objective: </strong>To build natural language processing (NLP) strategies to characterize measures of genetic counseling (GC) efficiency and classify measures according to phase of GC (pre- or post-genetic testing).</p><p><strong>Materials and methods: </strong>This study selected and annotated 800 GC notes from 7 clinical specialties in a large academic medical center for NLP model development and validation. The NLP approaches extracted GC efficiency measures, including direct and indirect time and GC phase. The models were then applied to 24 102 GC notes collected from January 2016 through December 2023.</p><p><strong>Results: </strong>NLP approaches performed well (F1 scores of 0.95 and 0.90 for direct time in GC and GC phase classification, respectively). Our findings showed median direct time in GC of 50 minutes, with significant differences in direct time distributions observed across clinical specialties, time periods (2016-2019 or 2020-2023), delivery modes (in person or telehealth), and GC phase.</p><p><strong>Discussion: </strong>As referrals to GC increase, there is increasing pressure to improve efficiency. Our NLP strategy was used to generate and summarize real-world evidence of GC time for 7 clinical specialties. These approaches enable future research on the impact of interventions intended to improve GC efficiency.</p><p><strong>Conclusion: </strong>This work demonstrated the practical value of NLP to provide a useful and scalable strategy to generate real-world evidence of GC efficiency. Principles presented in this work may also be valuable for health services research in other practice areas.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"295-303"},"PeriodicalIF":4.6,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12743353/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145483649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
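The genetic counseling study above reports model quality as F1 scores (0.95 and 0.90). As a reminder of what that metric combines, here is the standard F1 computation from confusion-matrix counts; the counts in the example are hypothetical, not the study's actual confusion matrices.

```python
# Standard F1 score from true-positive, false-positive, and false-negative
# counts. Example counts are illustrative only.
def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical evaluation: 90 correct extractions, 5 spurious, 5 missed.
print(round(f1_score(tp=90, fp=5, fn=5), 2))  # 0.95
```

Because F1 is the harmonic mean of precision and recall, a model must do well on both to reach scores like those reported.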
{"title":"PhenoFit: a framework for determining computable phenotyping algorithm fitness for purpose and reuse.","authors":"Laura K Wiley, Luke V Rasmussen, Rebecca T Levinson, Jennifer Malinowski, Sheila M Manemann, Melissa P Wilson, Martin Chapman, Jennifer A Pacheco, Theresa L Walunas, Justin B Starren, Suzette J Bielinski, Rachel L Richesson","doi":"10.1093/jamia/ocaf195","DOIUrl":"10.1093/jamia/ocaf195","url":null,"abstract":"<p><strong>Background: </strong>Computational phenotyping from electronic health records (EHRs) is essential for clinical research, decision support, and quality/population health assessment, but the proliferation of algorithms for the same conditions makes it difficult to identify which algorithm is most appropriate for reuse.</p><p><strong>Objective: </strong>To develop a framework for assessing phenotyping algorithm fitness for purpose and reuse.</p><p><strong>Fitness for purpose: </strong>Phenotyping algorithms are fit for purpose when they identify the intended population with performance characteristics appropriate for the intended application.</p><p><strong>Fitness for reuse: </strong>Phenotyping algorithms are fit for reuse when the algorithm is implementable and generalizable; that is, it identifies the same intended population with similar performance characteristics when applied to a new setting.</p><p><strong>Conclusions: </strong>The PhenoFit framework provides a structured approach to evaluate and adapt phenotyping algorithms for new contexts, increasing efficiency and consistency of identifying patient populations from EHRs.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"536-542"},"PeriodicalIF":4.6,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12844593/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145507875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
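Both the PhenoFit framework and the scoping review of TGD phenotyping models in this issue judge candidate algorithms largely by positive predictive value (PPV): of the patients an algorithm flags, how many are confirmed cases on chart review. A minimal sketch of that comparison follows; the algorithm names and chart-review counts are made up for illustration.

```python
# Positive predictive value: confirmed cases / all patients flagged by
# the algorithm. Algorithm names and counts below are hypothetical.
def ppv(confirmed: int, flagged: int) -> float:
    return confirmed / flagged if flagged else 0.0


# Hypothetical chart-review results for two candidate phenotyping algorithms.
candidates = {
    "dx_codes_only": ppv(confirmed=94, flagged=110),
    "structured_plus_notes": ppv(confirmed=68, flagged=70),
}
for name, value in candidates.items():
    print(f"{name}: PPV = {value:.1%}")
```

Note that a high-PPV algorithm may still miss many true cases (low sensitivity), which is why fitness for purpose depends on the intended application, not PPV alone.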