Early detection of Alzheimer's disease (AD) is essential for effective clinical intervention and disease management. However, conventional Deep Learning (DL) methods face limitations in analyzing complex brain magnetic resonance imaging (MRI), especially when training data are scarce. In this study, we propose a Quantum-Enhanced Neural Network Architecture (QENNA) that integrates quantum convolutional layers with classical deep learning to improve diagnostic accuracy in early AD detection. The model also incorporates quantum data augmentation strategies, including Quantum Generative Adversarial Networks (QGANs) and quantum random walks, to generate high-fidelity synthetic MRI scans and address training data limitations. Experiments on two public MRI datasets demonstrate that QENNA achieves up to 93.0 % accuracy and 96.0 % Area Under the Curve (AUC), outperforming state-of-the-art classical models. Ablation studies confirm that the quantum components substantially enhance performance. These results suggest that quantum-enhanced learning frameworks can significantly advance Artificial Intelligence (AI)-driven diagnostic tools for neurodegenerative disorders and support scalable, early-stage AD screening in clinical practice.
QENNA: A quantum-enhanced neural network for early Alzheimer's detection using magnetic resonance imaging. Chutchai Kaewta, Rapeepan Pitakaso, Surajet Khonjun, Thanatkij Srichok, Peerawat Luesak, Sarayut Gonwirat, Prem Enkvetchakul, Surasak Matitopanum, Thitinon Srisuwandee. Artificial Intelligence in Medicine, vol. 172, Article 103322. Pub Date: 2025-11-29. DOI: 10.1016/j.artmed.2025.103322
Pub Date: 2025-11-29, DOI: 10.1016/j.artmed.2025.103314
Wenxuan Mu, Di Zhao, Jiana Meng, Peng Chen, Shichang Sun, Yumeng Yang, Jian Wang, Hongfei Lin
Data Augmentation (DA) aims to create a new dataset to address the lack of data in various domains. Particularly in few-shot scenarios of the biomedical Named Entity Recognition (NER) domain, an effective DA method can enhance data diversity, reduce overfitting, and significantly improve the model’s generalization ability. In this work, we propose a novel DA method for NER tasks, which uses ChatGPT and prompt learning to extract high-quality data from large language models. The entity recognition tasks are then performed via transfer learning and efficient decoding strategies. Moreover, this study conducted extensive experiments on four publicly available biomedical datasets (BC5CDR, NCBI, BioNLP11EPI, and BioNLP13GE), demonstrating that our methods exhibit strong stability and entity recognition capabilities even in extremely limited scenarios. In the 5-shot, 20-shot, and 50-shot scenarios, the average F1 scores of the four datasets reached 72.96%, 75.05%, and 77.42%, respectively.
Data Augmentation for Few-Shot Biomedical NER Using ChatGPT. Artificial Intelligence in Medicine, vol. 172, Article 103314.
Pub Date: 2025-11-28, DOI: 10.1016/j.artmed.2025.103320
Mahdi Ghorbankhani, Maryam Safara
The integration of artificial intelligence (AI) into the field of mental health diagnosis has garnered increasing scholarly and clinical attention, particularly in relation to the early detection and classification of depression. This study offers a comprehensive review of the current landscape of AI-driven approaches for depression diagnosis, examining the methodologies, data modalities, and performance metrics employed across recent empirical investigations. Emphasizing machine learning and deep learning techniques, the study critically evaluates the utility of linguistic, behavioral, and physiological data sourced from social media, clinical interviews, speech recordings, and wearable devices. The findings suggest that AI systems, particularly those incorporating multimodal data fusion and advanced neural network architectures, demonstrate promising diagnostic accuracy and the potential to augment traditional psychiatric assessments. However, the study also identifies significant methodological, ethical, and practical challenges, including issues of dataset bias, algorithmic transparency, and clinical applicability. In response, the paper outlines key future directions aimed at improving model generalizability, enhancing interpretability, and fostering ethically responsible deployment in real-world settings. This review not only elucidates the transformative capacity of AI in mental health diagnostics but also provides a roadmap for advancing the development of robust, transparent, and clinically integrated AI systems for the detection of depression.
Artificial intelligence in depression diagnostics: A systematic review of methodologies and clinical applications. Artificial Intelligence in Medicine, vol. 172, Article 103320.
Pub Date: 2025-11-27, DOI: 10.1016/j.artmed.2025.103319
Zhixuan Zeng, Yang Liu, Shuo Yao, Xu Cai, Wenbin Nan, Yiyang Xie, Xun Gong
Background
ICU patients often suffer from critical and complex conditions, and multiple potential risks should be monitored to provide them with comprehensive care. However, no study has proposed a continual learning (CL) model that can effectively solve multiple clinical prediction tasks without catastrophic forgetting. This study proposes three deep CL models for ICU patients.
Methods
Three public ICU databases were employed. The included patients from MIMIC-III and MIMIC-IV were divided into eight task sets, and the patients from eICU-CRD composed the test set. We propose three CL models (CL_1, CL_2, CL_3) to sequentially learn eight prediction tasks on the eight task sets, and then externally validate them on the test set. We compare our models to three representative baseline CL models and to single-task (ST) and multi-task (MT) models. We train all the CL models under different task orders, and evaluate their prediction performance by multiple metrics and their memory ability by backward transfer (BWT). We also analyze the effect of previously learned tasks on learning new tasks.
Results
Our three CL models had comparable or slightly weaker performance compared to the ST and MT models on the eight tasks. They effectively mitigated catastrophic forgetting, and their performance was robust to different training orders. CL_2 and CL_3 even showed improved performance on the current task after learning some previous tasks. Our three CL models outperformed the baseline CL models in most experiments.
Conclusions
Our CL models show promise for sequentially learning multiple clinical prediction tasks for ICU patients. CL_2 and CL_3 can use information from previous tasks to improve the learning of new tasks. More datasets and tasks are still needed to further verify the validity of the CL models.
Development and validation of deep continual learning model to sequentially learn multiple clinical prediction tasks for ICU patients. Artificial Intelligence in Medicine, vol. 172, Article 103319.
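The abstract above evaluates memory ability with backward transfer (BWT) but does not give the formula. The sketch below assumes the commonly used definition, in which BWT is the average change in performance on each earlier task between the moment it was first learned and the end of training; the study may use a variant. The function name and the matrix values are hypothetical and purely illustrative, not results from the paper.

```python
def backward_transfer(R):
    """Average backward transfer for a sequence of T tasks.

    R[i][j] is the performance (e.g. AUROC) on task j measured after the
    model has finished training on task i. Only the lower triangle and
    the diagonal are used.

    BWT = 1/(T-1) * sum_{j < T-1} (R[T-1][j] - R[j][j])

    Negative values indicate forgetting; positive values indicate that
    later tasks improved performance on earlier ones.
    """
    T = len(R)
    if T < 2:
        raise ValueError("BWT needs at least two tasks")
    total = sum(R[T - 1][j] - R[j][j] for j in range(T - 1))
    return total / (T - 1)


# Illustrative (made-up) performance matrix for three tasks.
R_example = [
    [0.80, 0.00, 0.00],  # after task 0
    [0.78, 0.85, 0.00],  # after task 1
    [0.75, 0.83, 0.90],  # after task 2
]
bwt = backward_transfer(R_example)
print(f"BWT = {bwt:.3f}")
```

Under this definition, a value close to zero across several training orders would be consistent with the abstract's claim that forgetting was mitigated and that performance was robust to task order.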
Pub Date: 2025-11-27, DOI: 10.1016/j.artmed.2025.103318
Gondy Leroy, Prakash Bisht, Sai Madhuri Kandula, Nell Maltman, Sydney Rice
Autism spectrum disorder (ASD) is a complex neurodevelopmental condition whose rising prevalence places increasing demands on a lengthy diagnostic process. Machine learning (ML) has shown promise in automating ASD diagnosis, but most existing models operate as black boxes and are typically trained on a single dataset, limiting their generalizability.
In this study, we introduce a transparent and interpretable ML approach that leverages BioBERT, a state-of-the-art language model, to analyze unstructured clinical text. The model is trained to label descriptions of behaviors and map them to diagnostic criteria, which are then used to assign a final label (ASD or not). We evaluate transfer learning, the ability to transfer knowledge to new data, using two distinct real-world datasets. We trained on the datasets both sequentially and mixed together, and compared the performance of the best models and their ability to transfer to new data. We also created a black-box approach and repeated this transfer process for comparison.
Our transparent model demonstrated robust performance, with the mixed-data training strategy yielding the best results (97 % sensitivity, 98 % specificity). Sequential training across datasets led to a slight drop in performance, highlighting the importance of training data order. The black-box model performed worse (90 % sensitivity, 96 % specificity) when trained sequentially or with mixed data.
Overall, our transparent approach outperformed the black-box approach. Mixing datasets during training resulted in slightly better performance and should be the preferred approach when practically possible. This work paves the way for more trustworthy, generalizable, and clinically actionable AI tools in neurodevelopmental diagnostics.
Deep learning for autism detection using clinical notes: A comparison of transfer learning for a transparent and black-box approach. Artificial Intelligence in Medicine, vol. 172, Article 103318.
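The abstract above reports results as sensitivity and specificity. As a reminder of how these two figures are derived from binary predictions, here is a minimal sketch using the standard definitions. The function name and the label vectors are hypothetical; they are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN)   (recall on the positive class, e.g. ASD)
    specificity = TN / (TN + FP)   (recall on the negative class)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Illustrative labels: 1 = ASD, 0 = not ASD (made-up values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both values matters here because a screening model can reach high specificity simply by rarely predicting the positive class, which sensitivity would expose.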
Pub Date: 2025-11-27, DOI: 10.1016/j.artmed.2025.103321
Chiara Barbati, Luca Viviani, Riccardo Vecchio, Guglielmo Arzilli, Luigi De Angelis, Francesco Baglivo, Lucia Sacchi, Riccardo Bellazzi, Caterina Rizzo, Anna Odone
Objectives
The increasing digitisation of healthcare data and the rapid development of Artificial Intelligence (AI) pave the way for innovative strategies for infectious disease management. This study aimed to systematically retrieve and summarize current evidence on the use and performance of AI-based models for healthcare-associated infection (HAI) detection (i.e., identifying infections already present in available data) and prediction (i.e., estimating future risk based on earlier patient information).
Methods
PubMed, Embase, Scopus and Web of Science were searched for experimental and observational studies published between 1 July 2018 and 12 February 2024. Primary outcomes included technical performance metrics for HAI detection and prediction (e.g. recall, precision, AUROC). Any reported clinical, organisational or economic impacts were evaluated as secondary outcomes.
Results
Of 4489 records initially identified, 121 studies were included. Twenty-five studies (20.6 %) focused on HAI detection, with more than half achieving an AUROC above 0.90. In contrast, studies on HAI prediction (n = 93, 76.9 %) reported more heterogeneous performance. Among studies comparing AI with traditional methods (n = 32), AI models outperformed conventional approaches in 81.3 % of cases (n = 26).
Conclusions
A growing body of evidence suggests that AI models are equal to or superior to traditional methods for HAI detection and prediction, but challenges remain in evaluating performance, with many studies lacking comparators, few prospective evaluations, and limited assessment of organisational impact.
Artificial intelligence use and performance in detecting and predicting healthcare-associated infections: A systematic review. Artificial Intelligence in Medicine, vol. 172, Article 103321.
Pub Date: 2025-11-26, DOI: 10.1016/j.artmed.2025.103309
Anjana Wijekoon, Adrito Das, Zhehua Mao, Danyal Z. Khan, John G. Hanrahan, Danail Stoyanov, Hani J. Marcus, Sophia Bano
Surgical workflow recognition has the potential to accelerate training initiatives through the analysis of surgical videos, improve intraoperative efficiency, and support preemptive postoperative care. Unlike well-explored minimally invasive surgeries, where surgical workflows are consistent across patients, endoscopic pituitary surgery is difficult to automate. It involves a large number of steps, diverse sequences, optional steps, and frequent transitions, which current state-of-the-art (SOTA) methods handle poorly because they struggle with transferability. Progress is largely limited by the lack of annotated data that captures the complexity of pituitary surgery, and obtaining such annotations is both time-consuming and resource-intensive. This paper presents SurgflowNet, a novel spatio-temporal model for consistent pituitary workflow recognition leveraging unannotated data. We utilise a limited yet fully annotated dataset to infer quasi-labels for unannotated videos and curate a balanced dataset to train a robust frame encoder using the student–teacher framework. A spatio-temporal network that combines the resulting frame encoder and an LSTM network is trained with a consistency loss to ensure stability in step predictions. With a 5% improvement in macro F1-score and a 13.4% improvement in Edit Score over the SOTA, SurgflowNet demonstrates a significant improvement in workflow recognition for endoscopic pituitary surgery.
SurgflowNet: Leveraging unannotated video for consistent endoscopic pituitary surgery workflow recognition. Artificial Intelligence in Medicine, vol. 172, Article 103309.
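The SurgflowNet abstract reports macro F1-score and Edit Score without defining them. The sketch below assumes the definitions commonly used in action and workflow segmentation: macro F1 is the unweighted mean of per-class frame-level F1, and the Edit Score is a normalised Levenshtein distance between the predicted and true sequences of segments (consecutive duplicate labels collapsed). The paper's exact implementation may differ; all function names and the toy label sequences are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class frame-level F1 over all observed classes."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def to_segments(frames):
    """Collapse consecutive duplicate labels: [0,0,1,1,2] -> [0,1,2]."""
    segs = []
    for label in frames:
        if not segs or segs[-1] != label:
            segs.append(label)
    return segs


def levenshtein(a, b):
    """Classic edit distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_score(y_true, y_pred):
    """Segment-level score in [0, 100]; 100 means identical step order."""
    s_true, s_pred = to_segments(y_true), to_segments(y_pred)
    longest = max(len(s_true), len(s_pred))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(s_pred, s_true) / longest)


# Toy frame labels (made up): the prediction flickers between steps 1 and 2.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 1, 2]
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"edit score = {edit_score(y_true, y_pred):.1f}")
```

The toy prediction illustrates why both metrics are reported: frame-level F1 only counts mislabelled frames, while the Edit Score penalises spurious step transitions, which is the kind of instability a consistency loss is meant to reduce.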
Pub Date: 2025-11-26, DOI: 10.1016/j.artmed.2025.103315
Minghui Tan, Siyuan Tang, Zhao Ni, Shichao Kan, Paul Macharia, Haojie Zhang, Hao Yi, Guo Li, Jinfeng Ding
Background
Advance care planning (ACP) is a process that enables individuals to discuss future health care decisions before they become seriously ill or unable to communicate. Artificial intelligence (AI) has demonstrated promising outcomes in facilitating healthcare, offering the potential to facilitate ACP. However, the current status of using AI to facilitate ACP is unclear. This study aimed to investigate how AI has been leveraged to facilitate ACP, with a particular focus on the intended purposes, AI algorithms used, data sources, and the performance of AI in achieving the intended purposes.
Methods
The methodology employed in this study adhered to the Scoping Review Methodological Framework. PubMed, EMBASE, Web of Science, CINAHL, Cochrane Library, and IEEE Xplore databases were searched from their inception to July 2025. Descriptive analyses and narrative synthesis were used to summarize findings from the included studies.
Results
A total of 42 eligible studies were analyzed. The studies primarily used AI to detect ACP conversations and documents, identify patients needing ACP, promote ACP education, and explore linguistic features in ACP conversations. Rule-based natural language processing emerged as the most commonly used AI algorithm, with textual data being the primary modality employed. The included studies exhibited significant variation in performance evaluation.
Finding
The current use of AI in ACP remains limited in scope, primarily focusing on extracting ACP documentation from electronic health records and identifying patients who may benefit from ACP. The use of advanced technologies such as generative AI is limited, and performance evaluation primarily relies on discrimination metrics.
Leveraging artificial intelligence in advance care planning: A scoping review. Artificial Intelligence in Medicine, vol. 172, Article 103315.
Pub Date: 2025-11-25 DOI: 10.1016/j.artmed.2025.103316
Troy Gloyn , Christina Seo , Alexandra Godinho , Rahul Rahul , Siona Phadke , Hilary Fotheringham , Pete Wegier
<div><h3>Objective</h3><div>The purpose of this review was to comprehensively explore the landscape of recently published literature on the applications of artificial intelligence (AI) in predicting individualized patient waiting times in an emergency department (ED) and to identify pertinent considerations for practitioners and hospital decision-makers.</div></div><div><h3>Introduction</h3><div>ED overcrowding is being experienced by hospitals around the globe and has worsened in the post-COVID-19 era. The negative patient and staff experiences and poor clinical outcomes from overcrowding are evident and necessitate solutions to address this ongoing problem. Hospitals providing ED waiting time estimates to patients and staff are becoming popular; however, the more common methods, such as using rolling averages, suffer from an inability to capture the nuanced relationships within an ED. Recent applications of AI and machine learning (ML) in healthcare raise the possibility of applying these techniques to individualized waiting time predictions in the ED, although literature on the topic is sparse.</div></div><div><h3>Methods</h3><div>A systematized search was conducted on November 10th, 2025, using the electronic databases CINAHL, EMBASE (OVID), Medline (OVID), PsycINFO, Web of Science, and PubMed. Articles were considered for review if written in English, peer-reviewed, published after 2014, and used AI techniques. Descriptive analysis was performed on the final extracted data to facilitate the identification of common themes across studies. Themes were inferred from the proportional usage, across studies, of different data preparation, feature selection, and modeling strategies.</div></div><div><h3>Results</h3><div>The search identified 8613 citations that, after a rigorous screening process and critical appraisal, were narrowed down to 15 studies for final review. 
Most included studies were observational, using historical medical record data to compare modeling techniques or demonstrate a proof of concept. Studies commonly used one or more of ED queue-based, staff/resource-based, patient-based, and time-based feature categories. Incorporated AI methods included Random Forest, Linear Regression, and Least Absolute Shrinkage and Selection Operator (LASSO) techniques, among several others. All forms of AI and ML outperformed traditional rolling average estimates used by hospitals.</div></div><div><h3>Conclusions</h3><div>This review identified applications of AI in predicting individualized patient waiting times in the ED that outperform current waiting time estimate strategies. The use of nonlinear techniques, such as the Random Forest method, or incorporating queue-based feature categories, appeared to provide better performance in predictive estimates. Depending on the end user and modality in which the wait time estimate is conveyed, the importance of model selection is highlighted as a consideration to be made if overestimates or underestimate
{"title":"Using artificial intelligence to predict patient wait times in the emergency department: A scoping review","authors":"Troy Gloyn , Christina Seo , Alexandra Godinho , Rahul Rahul , Siona Phadke , Hilary Fotheringham , Pete Wegier","doi":"10.1016/j.artmed.2025.103316","DOIUrl":"10.1016/j.artmed.2025.103316","url":null,"abstract":"<div><h3>Objective</h3><div>The purpose of this review was to comprehensively explore the landscape of recently published literature on the applications of artificial intelligence (AI) in predicting individualized patient waiting times in an emergency department (ED) and identify pertinent considerations for practitioners and hospital decision-makers.</div></div><div><h3>Introduction</h3><div>ED overcrowding is being experienced by hospitals around the globe and has worsened in the post COVID-19 era. The negative patient and staff experiences and poor clinical outcomes from overcrowding are evident and necessitate solutions to address this ongoing problem. Hospitals providing ED waiting time estimates to patients and staff are becoming popular; however, the more common methods, such as using rolling averages, suffer from an inability to capture the nuanced relationships within an ED. Recent applications of AI and machine learning (ML) in healthcare raises the possibility of applying these techniques to individualized waiting time predictions in the ED; although, literature on the topic is sparse.</div></div><div><h3>Methods</h3><div>A systematized search was conducted on November 10th, 2025, using the electronic databases CINAHL, EMBASE (OVID), Medline (OVID), PsychINFO, Web of Science, and PubMed. Articles were considered for review if written in English, peer-reviewed, published after 2014, and used AI techniques. Descriptive analysis was performed on the final extracted data to facilitate the identification of common themes across studies. 
Themes were inferred from the proportional usage among studies, of different data preparation, feature selection, and modeling strategies.</div></div><div><h3>Results</h3><div>The search identified 8613 citations that, after a rigorous screening process and critical appraisal, were narrowed down to 15 studies for final review. Most included studies were observational, using historical medical record data to compare modeling techniques or demonstrate a proof of concept. Studies commonly used one or more of ED queue-based, staff/resource-based, patient-based, and time-based feature categories. Incorporated AI methods included Random Forest, Linear Regression, and Least Absolute Shrinkage and Selection Operator (LASSO) techniques, among several others. All forms of AI and ML outperformed traditional rolling average estimates used by hospitals.</div></div><div><h3>Conclusions</h3><div>This review identified applications of AI in predicting individualized patient waiting times in the ED that outperform current waiting time estimate strategies. The use of nonlinear techniques, such as the Random Forest method, or incorporating queue-based feature categories, appeared to provide better performance in predictive estimates. 
Depending on the end user and modality in which the wait time estimate is conveyed, the importance of model selection is highlighted as a consideration to be made if overestimates or underestimate","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"171 ","pages":"Article 103316"},"PeriodicalIF":6.2,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145624191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
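The rolling-average baseline that the abstract above contrasts with AI methods can be illustrated with a minimal Python sketch. This is an illustration only, not the method of any reviewed study or hospital: the class name `RollingAverageWait`, the `window` parameter, and the choice of minutes as the unit are all assumptions made for the example.

```python
from collections import deque


class RollingAverageWait:
    """Hypothetical baseline: mean wait of the last `window` completed visits."""

    def __init__(self, window=5):
        if window <= 0:
            raise ValueError("window must be a positive integer")
        # A bounded deque drops the oldest wait automatically once full.
        self._waits = deque(maxlen=window)

    def record(self, wait_minutes):
        """Store the observed wait (in minutes) of a patient who was just seen."""
        self._waits.append(float(wait_minutes))

    def estimate(self):
        """Return the current estimate, or None if no visits were recorded yet."""
        if not self._waits:
            return None
        return sum(self._waits) / len(self._waits)


if __name__ == "__main__":
    baseline = RollingAverageWait(window=3)
    for wait in (10, 20, 30, 60):
        baseline.record(wait)
    # Only the three most recent waits (20, 30, 60) contribute to the estimate.
    print(baseline.estimate())
```

The sketch makes the limitation concrete: every arriving patient receives the same number, regardless of queue composition, staffing, or patient characteristics, which are the feature categories the reviewed models draw on.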
Pub Date: 2025-11-25 DOI: 10.1016/j.artmed.2025.103313
Xiaoming Jiang , Guoying Ji , Ye Yan , Xiongjun Ye , Chao Liang , Bao Li , Wei Wang , Shudong Zhang , Lizhi Shao
Predicting invasiveness in renal cell carcinoma (RCC) is important for choosing a clinical surgical plan and for the patient's prognosis. Currently, besides invasive pathological assessment, it relies mainly on observation through computed tomography (CT) imaging. However, the limitations of human vision and of qualitative descriptions restrict the accuracy with which renal sinus invasion (RSI) is diagnosed. Recently, artificial intelligence approaches have shown promising prospects in cancer diagnosis. Because of the complex imaging characteristics of invasiveness, prediction models that focus only on tumor regions are inadequate; a comprehensive evaluation of intratumoral heterogeneity, peritumoral information, and the kidney in which the tumor resides is required. Therefore, in this study, we propose a context-aware heterogeneous graph neural network for multi-level description and invasiveness prediction in RCC. The strength of the proposed model lies in its ability to integrate imaging features at multiple levels and to learn disturbance-invariant features through a data-driven diffusion perturbation strategy. To evaluate the effectiveness and generalization of our model, we conduct extensive experiments on a multi-center dataset (including CT scan images of 437 patients) and compare our model with a series of state-of-the-art (SOTA) classification models. The experimental results show the superiority of our model for RSI classification (AUC = 0.88). Additionally, we perform a comparative study with clinical experts, in which the proposed method is significantly better than existing assessment methods and clinical experts (p < 0.05). In general, our work provides an effective assessment tool for automated diagnosis of RSI in RCC and also offers new insights for constructing more precise tumor prediction models.
{"title":"Context-aware heterogeneous graph neural network for multi-level description and invasiveness prediction in renal cell carcinoma","authors":"Xiaoming Jiang , Guoying Ji , Ye Yan , Xiongjun Ye , Chao Liang , Bao Li , Wei Wang , Shudong Zhang , Lizhi Shao","doi":"10.1016/j.artmed.2025.103313","DOIUrl":"10.1016/j.artmed.2025.103313","url":null,"abstract":"<div><div>The invasiveness prediction in renal cell carcinoma (RCC) is of significant importance for the decision of clinical surgical plans and the patients' prognosis. Currently, besides invasive pathological assessment, it mainly relies on observation through computed tomography (CT) imaging. However, limitations of human vision and qualitative descriptions restrict the accuracy of the diagnosis of renal sinus invasion (RSI). Recently, artificial intelligence approaches have shown promising prospects in cancer diagnosis. Due to the complex imaging characteristics of invasiveness, prediction models that only focus on tumor regions are inadequate, requiring comprehensive evaluation of intratumoral heterogeneity, peritumoral information, and the kidney in which the tumor resides. Therefore, in this study, we propose a context-aware heterogeneous graph neural network for multi-level description and invasiveness prediction in RCC. The superiority of the proposed model lies in its ability to integrate imaging features at multi-level, and to learn disturbance invariant features through a data-driven diffusion perturbation strategy. To evaluate the effectiveness and generalization of our model, we conduct extensive experiments on a multi-center dataset (including CT scan images of 437 patients) to compare our model with a series of state-of-the-art (SOTA) classification models. The experimental results show the superiority of our model for RSI classification (<span><math><mi>AUC</mi><mo>=</mo><mn>0.88</mn></math></span>). 
Additionally, we also perform a comparative study with clinical experts, and the proposed method is significantly better than existing assessment methods and clinical experts (<span><math><mi>p</mi><mo><</mo><mn>0.05</mn></math></span>). In general, our work provides an effective assessment tool for automated diagnosis of RSI in RCC and also offers new insights for constructing more precise tumor prediction models.</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"172 ","pages":"Article 103313"},"PeriodicalIF":6.2,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145625344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
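The AUC reported in the record above is a discrimination metric: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. A minimal Python sketch of that pairwise definition follows. The function name `auc` and the toy labels and scores are assumptions for illustration and are not taken from the paper, whose evaluation code is not shown here.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data: three of the four positive/negative pairs are ordered correctly.
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The quadratic loop is chosen for readability; for large datasets a rank-based computation or an established library routine would be the more practical choice.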