Convergence of evolving artificial intelligence and machine learning techniques in precision oncology
NPJ Digital Medicine
Pub Date : 2025-01-31 DOI: 10.1038/s41746-025-01471-y
Elena Fountzilas, Tillman Pearce, Mehmet A. Baysal, Abhijit Chakraborty, Apostolia M. Tsimberidou
The confluence of new technologies with artificial intelligence (AI) and machine learning (ML) analytical techniques is rapidly advancing the field of precision oncology, promising to improve diagnostic approaches and therapeutic strategies for patients with cancer. By analyzing multi-dimensional, multiomic, spatial pathology, and radiomic data, these technologies enable a deeper understanding of the intricate molecular pathways, aiding in the identification of critical nodes within the tumor’s biology to optimize treatment selection. The applications of AI/ML in precision oncology are extensive and include the generation of synthetic data, e.g., digital twins, in order to provide the necessary information to design or expedite the conduct of clinical trials. Currently, many operational and technical challenges exist related to data technology, engineering, and storage; algorithm development and structures; quality and quantity of the data and the analytical pipeline; data sharing and generalizability; and the incorporation of these technologies into the current clinical workflow and reimbursement models.
A machine learning decision support tool optimizes WGS utilization in a neonatal intensive care unit
NPJ Digital Medicine
Pub Date : 2025-01-30 DOI: 10.1038/s41746-025-01458-9
Edwin F. Juarez, Bennet Peterson, Erica Sanford Kobayashi, Sheldon Gilmer, Laura E. Tobin, Brandan Schultz, Jerica Lenberg, Jeanne Carroll, Shiyu Bai-Tong, Nathaly M. Sweeney, Curtis Beebe, Lawrence Stewart, Lauren Olsen, Julie Reinke, Elizabeth A. Kiernan, Rebecca Reimers, Kristen Wigby, Chris Tackaberry, Mark Yandell, Charlotte Hobbs, Matthew N. Bainbridge
The Mendelian Phenotype Search Engine (MPSE), a clinical decision support tool using Natural Language Processing and Machine Learning, helped neonatologists expedite decisions to order whole genome sequencing (WGS) to diagnose patients in the neonatal intensive care unit. After the MPSE was introduced, utilization of WGS increased, time to ordering WGS decreased, and WGS diagnostic yield increased.
Author Correction: Establishing responsible use of AI guidelines: a comprehensive case study for healthcare institutions.
NPJ Digital Medicine 8(1):70
Pub Date : 2025-01-30 DOI: 10.1038/s41746-025-01445-0
Agustina D Saenz, Amanda Centi, David Ting, Jacqueline G You, Adam Landman, Rebecca G Mishuris
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11782563/pdf/
Digital twin driven electrode optimization for wearable bladder monitoring via bioimpedance.
NPJ Digital Medicine 8(1):73
Pub Date : 2025-01-30 DOI: 10.1038/s41746-025-01441-4
H Trask Crane, John A Berkebile, Samer Mabrouk, Nicholas Riccardelli, Omer T Inan
Monitoring fluid intake and output for congestive heart failure (CHF) patients is an essential tool to prevent fluid overload, a principal cause of hospital admissions. Addressing this, bladder volume measurement systems utilizing bioimpedance and electrical impedance tomography have been proposed, with limited exploration of continuous monitoring within a wearable design. Advancing this format, we developed a conductivity digital twin from radiological data, where we performed exhaustive simulations to optimize electrode sensitivity on an individual basis. Our optimized placement demonstrated an efficient proof-of-concept volume estimation that required as few as seven measurement frames while maintaining low errors (95% CI: -1.11% to 1.00%) for volumes ≥100 mL. Additionally, we quantify the impact of ascites, a common confounding condition in CHF, on the bioimpedance signal. By improving monitoring technology, we aim to reduce CHF mortality by empowering patients and clinicians with a more thorough understanding of fluid status.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11782588/pdf/
Telemedicine expands cardiovascular care in China – lessons for health equity in the United States
NPJ Digital Medicine
Pub Date : 2025-01-30 DOI: 10.1038/s41746-025-01474-9
Elizabeth J. Enichen, Kimia Heydari, Ben Li, Joseph C. Kvedar
Liu et al.’s recent study reveals that telemedicine expanded access to cardiovascular care in China, enabling patients in poorer areas of the country to access care in cities with more resources. While these findings may support the global expansion of telemedicine, implementation often proves challenging. This article examines the potential and limitations of adopting similar telemedicine efforts within the U.S. to advance geographic health equity.
Whole slide image based deep learning refines prognosis and therapeutic response evaluation in lung adenocarcinoma
NPJ Digital Medicine
Pub Date : 2025-01-29 DOI: 10.1038/s41746-025-01470-z
Tao Chen, Jialiang Wen, Xinchen Shen, Jiaqi Shen, Jiajun Deng, Mengmeng Zhao, Long Xu, Chunyan Wu, Bentong Yu, Minglei Yang, Minjie Ma, Junqi Wu, Yunlang She, Yifan Zhong, Likun Hou, Yanrui Jin, Chang Chen
Existing prognostic models are useful for estimating the prognosis of lung adenocarcinoma patients, but there remains room for improvement. In the current study, we developed a deep learning model based on histopathological images to predict the recurrence risk of lung adenocarcinoma patients. The efficiency of the model was then evaluated in independent multicenter cohorts. The model-defined high- and low-risk groups successfully stratified prognosis of the entire cohort. Moreover, multivariable Cox analysis identified the model-defined risk groups as an independent predictor for disease-free survival. Importantly, combining TNM stage with the established model helped to distinguish subgroups of patients with high-risk stage II and stage III disease who are highly likely to benefit from adjuvant chemotherapy. Overall, our study highlights the significant value of the constructed model to serve as a complementary biomarker for survival stratification and adjuvant therapy selection for lung adenocarcinoma patients after resection.
The use of large language models in detecting Chinese ultrasound report errors
NPJ Digital Medicine
Pub Date : 2025-01-28 DOI: 10.1038/s41746-025-01468-7
Yuqi Yan, Kai Wang, Bojian Feng, Jincao Yao, Tian Jiang, Zhiyan Jin, Yin Zheng, Yahan Zhou, Chen Chen, Lin Sui, Xiayi Chen, Yanhong Du, Jie Yang, Qianmeng Pan, Lingyan Zhou, Vicky Yang Wang, Ping Liang, Dong Xu
This retrospective study evaluated the efficacy of large language models (LLMs) in improving the accuracy of Chinese ultrasound reports. Data from three hospitals (January-April 2024) including 400 reports with 243 errors across six categories were analyzed. Three GPT versions and Claude 3.5 Sonnet were tested in zero-shot settings, with the top two models further assessed in few-shot scenarios. Six radiologists of varying experience levels performed error detection on a randomly selected test set. In the zero-shot setting, Claude 3.5 Sonnet and GPT-4o achieved the highest error detection rates (52.3% and 41.2%, respectively). In the few-shot setting, Claude 3.5 Sonnet outperformed senior and resident radiologists, while GPT-4o excelled in spelling error detection. LLMs processed reports faster than the quickest radiologist (Claude 3.5 Sonnet: 13.2 s, GPT-4o: 15.0 s, radiologist: 42.0 s per report). This study demonstrates the potential of LLMs to enhance ultrasound report accuracy, outperforming human experts in certain aspects.
Machine learning meta-analysis identifies individual characteristics moderating cognitive intervention efficacy for anxiety and depression symptoms
NPJ Digital Medicine
Pub Date : 2025-01-28 DOI: 10.1038/s41746-025-01449-w
Thalia Richter, Reut Shani, Shachaf Tal, Nazanin Derakshan, Noga Cohen, Philip M. Enock, Richard J. McNally, Nilly Mor, Shimrit Daches, Alishia D. Williams, Jenny Yiend, Per Carlbring, Jennie M. Kuckertz, Wenhui Yang, Andrea Reinecke, Christopher G. Beevers, Brian E. Bunnell, Ernst H. W. Koster, Sigal Zilcha-Mano, Hadas Okon-Singer
Cognitive training is a promising intervention for psychological distress; however, its effectiveness has yielded inconsistent outcomes across studies. This research is a pre-registered individual-level meta-analysis to identify factors contributing to cognitive training efficacy for anxiety and depression symptoms. Machine learning methods, alongside traditional statistical approaches, were employed to analyze 22 datasets with 1544 participants who underwent working memory training, attention bias modification, interpretation bias modification, or inhibitory control training. Baseline depression and anxiety symptoms were found to be the most influential factor, with individuals with more severe symptoms showing the greatest improvement. The number of training sessions was also important, with more sessions yielding greater benefits. Cognitive trainings were associated with higher predicted improvement than control conditions, with attention and interpretation bias modification showing the most promise. Despite the limitations of heterogeneous datasets, this investigation highlights the value of large-scale comprehensive analyses in guiding the development of personalized training interventions.
Dual modality feature fused neural network integrating binding site information for drug target affinity prediction
NPJ Digital Medicine
Pub Date : 2025-01-28 DOI: 10.1038/s41746-025-01464-x
Haohuai He, Guanxing Chen, Zhenchao Tang, Calvin Yu-Chian Chen
Accurately predicting binding affinities between drugs and targets is crucial for drug discovery but remains challenging due to the complexity of modeling interactions between small drugs and large targets. This study proposes DMFF-DTA, a dual-modality neural network model that integrates sequence and graph structure information from drugs and proteins for drug-target affinity prediction. The model introduces a binding site-focused graph construction approach to extract binding information, enabling more balanced and efficient modeling of drug-target interactions. Comprehensive experiments demonstrate that DMFF-DTA outperforms state-of-the-art methods with significant improvements. The model exhibits excellent generalization capabilities on completely unseen drugs and targets, achieving an improvement of over 8% compared to existing methods. Model interpretability analysis validates the biological relevance of the model. A case study in pancreatic cancer drug repurposing demonstrates its practical utility. This work provides an interpretable, robust approach to integrate multi-view drug and protein features for advancing computational drug discovery.
A phenotype-based AI pipeline outperforms human experts in differentially diagnosing rare diseases using EHRs
NPJ Digital Medicine
Pub Date : 2025-01-28 DOI: 10.1038/s41746-025-01452-1
Xiaohao Mao, Yu Huang, Ye Jin, Lun Wang, Xuanzhong Chen, Honghong Liu, Xinglin Yang, Haopeng Xu, Xiaodong Luan, Ying Xiao, Siqin Feng, Jiahao Zhu, Xuegong Zhang, Rui Jiang, Shuyang Zhang, Ting Chen
Rare diseases, affecting ~350 million people worldwide, pose significant challenges in clinical diagnosis due to the lack of experienced physicians and the complexity of differentiating between numerous rare diseases. To address these challenges, we introduce PhenoBrain, a fully automated artificial intelligence pipeline. PhenoBrain utilizes a BERT-based natural language processing model to extract phenotypes from clinical texts in EHRs and employs five new diagnostic models for differential diagnoses of rare diseases. The AI system was developed and evaluated on diverse, multi-country rare disease datasets, comprising 2271 cases with 431 rare diseases. In 1936 test cases, PhenoBrain achieved an average predicted top-3 recall of 0.513 and a top-10 recall of 0.654, surpassing 13 leading prediction methods. In a human-computer study with 75 cases, PhenoBrain exhibited exceptional performance with a top-3 recall of 0.613 and a top-10 recall of 0.813, surpassing the performance of 50 specialist physicians and large language models like ChatGPT and GPT-4. Combining PhenoBrain’s predictions with specialists increased the top-3 recall to 0.768, demonstrating its potential to enhance diagnostic accuracy in clinical workflows.