Upper gastrointestinal cancer (UGC) sometimes metastasizes to the splenic hilum lymph node (SHLN). However, surgical removal of the SHLN is technically difficult, and the risk of postoperative complications is high. Although there are models that predict SHLN metastasis, they usually provide only point estimates of risk and convey little about the uncertainty of those estimates. To address this issue, we aimed to develop a Bayesian logistic regression model called Bayes-SHLNM. The performance of the model was compared with that of a frequentist logistic regression (FLR) model as a benchmark, and the posterior probability distribution (PPD) was shown for each individual case. The performance of Bayes-SHLNM was equivalent to that of the FLR model, and the PPD for each case was visualized to express the uncertainty. These results indicate that the Bayes-SHLNM model has the potential to be used as a decision support system in clinical settings where uncertainty is high.
Title: Establishment of a machine learning model for predicting splenic hilar lymph node metastasis
Authors: Kenichi Ishizu, Satoshi Takahashi, Nobuji Kouno, Ken Takasawa, Katsuji Takeda, Kota Matsui, Masashi Nishino, Tsutomu Hayashi, Yukinori Yamagata, Shigeyuki Matsui, Takaki Yoshikawa, Ryuji Hamamoto
Journal: NPJ Digital Medicine; Pub Date: 2025-02-11; DOI: 10.1038/s41746-025-01480-x
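The contrast drawn in the Bayes-SHLNM abstract, between a single point estimate and a per-case posterior probability distribution, can be illustrated with a minimal sketch. It assumes that posterior draws of the logistic regression coefficients are already available from some sampler; the function names, the feature vector, and the draws below are all hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, quantiles


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def risk_distribution(features, coefficient_draws):
    """Return one predicted probability per posterior draw.

    features: one patient's covariates, with 1.0 first for the intercept.
    coefficient_draws: list of coefficient vectors (hypothetical sampler output).
    """
    return [
        sigmoid(sum(b * x for b, x in zip(draw, features)))
        for draw in coefficient_draws
    ]


def summarize(probabilities):
    """Posterior mean and a central 95% interval (needs at least two draws)."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%.
    cuts = quantiles(probabilities, n=40, method="inclusive")
    return mean(probabilities), cuts[0], cuts[-1]


if __name__ == "__main__":
    # Hypothetical draws for (intercept, one covariate); illustrative only.
    draws = [[-2.0, 0.8], [-1.7, 0.6], [-2.3, 1.1], [-1.9, 0.9]]
    patient = [1.0, 1.5]
    print(summarize(risk_distribution(patient, draws)))
```

A frequentist fit would report one probability for this patient; the list returned by `risk_distribution` is what would be plotted to show how wide or narrow the plausible risk range is for that case.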
Pub Date: 2025-02-11; DOI: 10.1038/s41746-025-01489-2
Charles Alba, Bing Xue, Joanna Abraham, Thomas Kannampallil, Chenyang Lu
Clinical notes recorded during a patient’s perioperative journey hold immense informational value. Advances in large language models (LLMs) offer opportunities for bridging this gap. Using 84,875 preoperative notes and their associated surgical cases from 2018 to 2021, we examine the performance of LLMs in predicting six postoperative risks using various fine-tuning strategies. Pretrained LLMs outperformed traditional word embeddings by an absolute 38.3% in AUROC and 33.2% in AUPRC. Self-supervised fine-tuning further improved performance by 3.2% and 1.5%. Incorporating labels into training further increased AUROC by 1.8% and AUPRC by 2%. The highest performance was achieved with a unified foundation model, with improvements of 3.6% for AUROC and 2.6% for AUPRC compared to self-supervision, highlighting the foundational capabilities of LLMs in predicting postoperative risks, which could be beneficial when deployed for perioperative care.
Title: The foundational capabilities of large language models in predicting postoperative risks using clinical notes
Journal: NPJ Digital Medicine
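The abstract above reports its gains as absolute differences in AUROC. As a reminder of what that metric measures, here is a minimal pairwise sketch: AUROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names and the toy labels and scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def auroc(labels, scores):
    """Pairwise AUROC: share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def absolute_gain(auroc_new, auroc_baseline):
    """Absolute gain in percentage points (not a relative change)."""
    return 100.0 * (auroc_new - auroc_baseline)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]          # toy outcomes, illustrative only
    scores = [0.1, 0.4, 0.35, 0.8]  # toy model scores
    print(auroc(labels, scores))
```

An "absolute" improvement of a few percent therefore means the two AUROC values differ by that many percentage points, which is how `absolute_gain` computes it.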
Pub Date: 2025-02-10; DOI: 10.1038/s41746-025-01444-1
Yuta Takahashi, Hayato Idei, Misako Komatsu, Jun Tani, Hiroaki Tomita, Yuichi Yamashita
At the forefront of bridging computational brain modeling with personalized medicine, this study introduces a novel, real-time electrocorticogram (ECoG) simulator based on the digital twin brain concept. Utilizing advanced data assimilation techniques, specifically a Variational Bayesian Recurrent Neural Network model with hierarchical latent units, the simulator dynamically predicts ECoG signals reflecting real-time brain latent states. By assimilating broad ECoG signals from macaque monkeys across awake and anesthetized conditions, the model successfully updated its latent states in real time, enhancing the precision of ECoG signal simulations. Underlying the successful data assimilation, self-organization of latent states in the model was observed, reflecting brain states and individuality. This self-organization facilitated simulation of virtual drug administration and uncovered functional networks underlying changes in brain function during anesthesia. These results show that the proposed model can simulate brain signals in real time with high accuracy and is also useful for revealing underlying information processing dynamics.
Title: Digital twin brain simulator for real-time consciousness monitoring and virtual intervention using primate electrocorticogram data
Journal: NPJ Digital Medicine
Pub Date: 2025-02-09; DOI: 10.1038/s41746-025-01493-6
Samir Akre, Zachary D. Cohen, Amelia Welborn, Tomislav D. Zbozinek, Brunilda Balliu, Michelle G. Craske, Alex A. T. Bui
This study examines the relationship between self-reported and physiologically measured sleep quality and their impact on neurocognitive performance in individuals with depression. Using data from 249 participants with medium to severe depression monitored over 13 weeks, sleep quality was assessed via retrospective self-report and physiological measures from consumer smartphones and smartwatches. Correlations between self-reported and physiological sleep measures were generally weak. Machine learning models revealed that self-reported sleep quality could detect all depression symptoms measured using the Patient Health Questionnaire-14, whereas physiological sleep measures detected “sleeping too much” and low libido. Notably, only self-reported sleep disturbances correlated significantly with neurocognitive performance, specifically with processing speed. Physiological sleep was able to detect changes in self-reported sleep, medication use, and sleep latency. These findings emphasize that self-reported and physiological sleep quality are not measuring the same construct, and both are important to monitor when studying sleep quality in relation to depression.
Title: Comparing self reported and physiological sleep quality from consumer devices to depression and neurocognitive performance
Journal: NPJ Digital Medicine
Pub Date: 2025-02-06; DOI: 10.1038/s41746-025-01453-0
Casper van der Zee, Heshow Jamal, Marc Muijzer, Laurence Frank, Gerko Vink, Robert Wisse
Refractive errors are the leading cause of preventable visual impairment, to which web-based remote refraction could contribute. We report real-world 2021–2022 data of the underlying algorithm and validated these against conventional prescriptions among healthy individuals (high visual acuity and satisfied with their current refraction). Participants were 18–45 years with a spherical (S) error between −3.50 + 2.00S to −2.00 Diopter Cylinder (DC), reported as Spherical Equivalent (SEQ) in mean differences and 95% limits of agreement. Consecutive measurements (n = 14,680) were assessed, of which n = 6386 were selected for validation. The mean difference was 0.01 D (SD 0.69) and −0.73 D (SD 0.92) for myopes and hyperopes, respectively. The algorithm shows variation; nonetheless, 67% and 82% of myopes were within ±0.5 D and ±0.75 D. The test underestimates hyperopes (34% and 50% within ±0.5 D and ±0.75 D) and had inconsistencies distinguishing hyperopia. This proof-of-concept shows that home testing has the potential to increase accessibility to care by delivering a valuable alternative for uncomplicated refractive assessments.
Title: Real world data on digital remote refraction in a healthy population of 14,680 eyes
Journal: NPJ Digital Medicine
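The agreement statistics quoted in the remote refraction abstract above (mean difference, 95% limits of agreement, and the share of eyes within ±0.5 D or ±0.75 D) follow a standard Bland-Altman style calculation. The sketch below shows that calculation under the usual assumption of approximately normal differences; the function names and the sample values are hypothetical and do not reproduce the study's data.

```python
from statistics import mean, stdev


def limits_of_agreement(test, reference, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired values.

    The limits are mean difference +/- z * SD of the differences,
    with z = 1.96 for the conventional 95% limits.
    """
    if len(test) != len(reference) or len(test) < 2:
        raise ValueError("need two equal-length sequences with at least two pairs")
    diffs = [t - r for t, r in zip(test, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


def fraction_within(test, reference, tolerance):
    """Share of pairs whose absolute difference is at most `tolerance`."""
    diffs = [abs(t - r) for t, r in zip(test, reference)]
    return sum(d <= tolerance for d in diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical spherical equivalents in diopters, illustrative only.
    remote = [-1.25, -2.00, -0.75, -3.00, 0.50]
    conventional = [-1.00, -2.25, -0.75, -2.50, 1.25]
    print(limits_of_agreement(remote, conventional))
    print(fraction_within(remote, conventional, 0.5))
```

A mean difference near zero with wide limits, as reported for myopes, indicates little systematic bias but noticeable case-to-case scatter, which is why the within-tolerance fractions are reported alongside it.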
Pub Date: 2025-02-06; DOI: 10.1038/s41746-024-01426-9
Ella Goldschmidt, Ella Rannon, Daniel Bernstein, Asaf Wasserman, Michael Roimi, Anat Shrot, Dan Coster, Ron Shamir
Antimicrobial resistance is a rising global health threat, leading to ineffective treatments, increased mortality and rising healthcare costs. In ICUs, inappropriate empiric antibiotic therapy is often given due to treatment urgency, causing poor outcomes. This study developed a machine learning model to predict the appropriateness of empiric antibiotics for ICU-acquired bloodstream infections, using data from the MIMIC-III database. To address missing values and dataset imbalances, novel computational methods were introduced. The model achieved an AUROC of 77.3% and AUPRC of 40.4% on validation, with similar results on external datasets from MIMIC-IV and Rambam Hospital. The model also predicted mortality risk, identifying a 30% mortality rate in high-risk patients versus 16.8% in low-risk groups. External validation on the eICU database showed a comparable gap, with mortality rates at 24% for high-risk and 7.7% for low-risk groups. Our study demonstrates the potential of machine learning models to predict inappropriate empiric antibiotic treatment.
Title: Predicting appropriateness of antibiotic treatment among ICU patients with hospital-acquired infection
Journal: NPJ Digital Medicine
Pub Date: 2025-02-06; DOI: 10.1038/s41746-025-01434-3
Malte Tölle, Philipp Garthe, Clemens Scherer, Jan Moritz Seliger, Andreas Leha, Nina Krüger, Stefan Simm, Simon Martin, Sebastian Eble, Halvar Kelm, Moritz Bednorz, Florian André, Peter Bannas, Gerhard Diller, Norbert Frey, Stefan Groß, Anja Hennemuth, Lars Kaderali, Alexander Meyer, Eike Nagel, Stefan Orwat, Moritz Seiffert, Tim Friede, Tim Seidler, Sandy Engelhardt
Federated learning is a renowned technique for utilizing decentralized data while preserving privacy. However, real-world applications often face challenges like partially labeled datasets, where only a few locations have certain expert annotations, leaving large portions of unlabeled data unused. Leveraging these could enhance transformer architectures’ ability in regimes with small and diversely annotated sets. We conduct the largest federated cardiac CT analysis to date (n = 8,104) in a real-world setting across eight hospitals. Our two-step semi-supervised strategy distills knowledge from task-specific CNNs into a transformer. First, CNNs predict on unlabeled data per label type; then the transformer learns from these predictions with label-specific heads. This improves predictive accuracy, enables simultaneous learning of all partial labels across the federation, and outperforms UNet-based models in generalizability on downstream tasks. Code and model weights are made openly available to support future cardiac CT analysis.
Title: Real world federated learning with a knowledge distilled transformer for cardiac CT imaging
Journal: NPJ Digital Medicine
Pub Date: 2025-02-06; DOI: 10.1038/s41746-025-01485-6
Peter C. Nauka, Jason N. Kennedy, Emily B. Brant, Matthieu Komorowski, Romain Pirracchio, Derek C. Angus, Christopher W. Seymour
Pivotal moments in sepsis care occur in the emergency department (ED); however, it is unclear whether ED data are adequate to inform reinforcement learning (RL) models. We evaluated the early opportunity for the AI Clinician, a validated ICU-based RL model, as a use case. Amongst emergency sepsis patients, model parameters were often missing and invariably measured. Current iterations of RL models trained on ICU data face challenges in emergency sepsis care.
Title: Challenges with reinforcement learning model transportability for sepsis treatment in emergency care
Journal: NPJ Digital Medicine
Pub Date: 2025-02-06; DOI: 10.1038/s41746-024-01339-7
Fenglin Liu, Zheng Li, Qingyu Yin, Jinfa Huang, Jiebo Luo, Anshul Thakur, Kim Branson, Patrick Schwab, Bing Yin, Xian Wu, Yefeng Zheng, David A. Clifton
Radiology images are among the most commonly used data in daily clinical diagnosis. Typically, clinical diagnosis using radiology images involves disease reporting and classification, where the former is a multimodal task in which textual reports are generated to describe clinical findings in images, as is common in various domains, e.g., chest X-ray or computed tomography. Existing approaches are mainly supervised, and their quality depends heavily on the volume and quality of available labeled data. However, for rarer or more novel diseases, enrolling patients to collect data is both time-consuming and expensive. For non-English languages, sufficient quantities of labeled data are typically not available. We propose the Multimodal Multidomain Multilingual Foundation Model. It is useful for rare diseases and non-English languages, where labeled data are frequently much scarcer and may even be absent. Our approach achieves encouraging performance on nine datasets, including 2 infectious and 14 non-infectious diseases.
Title: A multimodal multidomain multilingual medical foundation model for zero shot clinical diagnosis
Journal: NPJ Digital Medicine
Pub Date: 2025-02-06; DOI: 10.1038/s41746-025-01459-8
Frank E. Rademakers, Elisabetta Biasin, Nico Bruining, Enrico G. Caiani, Rhodri H. Davies, Stephen H. Gilbert, Eric Kamenjasevic, Gearóid McGauran, Gearóid O’Connor, Jean-Baptiste Rouffet, Baptiste Vasey, Alan G. Fraser
The European CORE–MD consortium (Coordinating Research and Evidence for Medical Devices) proposes a score for medical devices incorporating artificial intelligence or machine learning algorithms. Its domains are summarised as valid clinical association, technical performance, and clinical performance. High scores indicate that extensive clinical investigations should be undertaken before regulatory approval, whereas lower scores indicate devices for which less pre-market clinical evaluation may be balanced by more post-market evidence.
Title: CORE-MD clinical risk score for regulatory evaluation of artificial intelligence-based medical device software
Journal: NPJ Digital Medicine