Ambulatory surgery enhances resource utilization through reduced hospital stays and costs without compromising clinical outcomes. However, existing workflows are labor-intensive and repetitive, necessitating optimization in patient selection, assessment, admission notifications, no-show management, patient education, and postoperative follow-up. Artificial intelligence (AI) offers promising solutions to these challenges. This narrative review aimed to outline current AI applications in ambulatory surgery, appraise limitations, and discuss actionable pathways for future innovation. The PubMed database was systematically searched. Inclusion criteria were original research on AI in ambulatory surgery. Exclusion criteria covered weak thematic connections and unavailable full texts. Two researchers independently conducted the search and data extraction. 50 articles were analyzed in this review. AI technologies, including machine learning, computer vision, and natural language processing, are increasingly used for preoperative patient selection and no-show prediction, intraoperative patient information verification, real-time monitoring and decision support, and postoperative recovery monitoring and health guidance. Nonetheless, AI implementation faces challenges such as data heterogeneity, algorithm interpretability, ethical concerns, and regulatory hurdles. AI demonstrates significant potential to optimize ambulatory surgery procedures, enhance clinical decision-making, and improve patient outcomes. Standardized data collection, collaborative data-sharing, transparency, and model validation with clinically meaningful endpoints are essential for robust and extensive AI application in ambulatory surgery. These elements can ultimately enhance the efficiency and safety of ambulatory surgical procedures.
Artificial Intelligence in Ambulatory Surgery: Current Applications, Challenges, and Future Directions. Lidi Liu, Peng Zhang, Yu Jia, Li Hou, Dongmei Peng, Zhichao Li, Peng Liang. Journal of Medical Systems 49(1), 146. Pub Date: 2025-10-27. DOI: 10.1007/s10916-025-02286-w
Pub Date: 2025-10-23. DOI: 10.1007/s10916-025-02252-6
Sara Montagna, Rita Stagni, Giada Pierucci, Arianna Aceti, Duccio Maria Cordelli, Maria Cristina Bisi
Preterm birth leads to an increased risk of long-term consequences, with over 50% of children born <30 weeks facing motor, cognitive, or behavioural impairments. Early monitoring of motor developmental trajectories, strongly associated with neurodevelopmental outcome, is crucial for a timely identification of deviations from the reference path and the prediction of possible neurodevelopmental disorders (NDDs). However, the current understanding of the causal pathways through which motor difficulties emerge and evolve is limited by the lack of quantitative, standardised, and interpretative measures for infant motor development, and the need for a complex multidisciplinary examination of medical history. To overcome these limitations, we propose an approach based on Digital Twins (DTs) and innovative technology-based interpretative metrics for motor assessment to support holistic longitudinal evaluations of infant development. The DT enables the integration of multimodal data, including algorithms for data processing and artificial intelligence methods for data analysis, into a unique framework. Details on the DT ecosystem, internal model, and engine are provided. As a first step, a proof-of-concept application was implemented to show the feasibility of the framework, not yet exploring its full longitudinal potential. This initial study was based on already published data (17 full-term children, 21 preterm children born between 29 and 36 gestational weeks, and 8 very preterm children born ≤28 gestational weeks) and illustrates the integration of motor measures with clinical and cognitive information, their standardisation into the DT model, and a first set of advanced analyses. Given the relevance of the problem and the lack of standardised, structured follow-up protocols to monitor motor trajectory in preterm children, the proposed solution has the potential for a significant impact in clinical practice. 
Moreover, its usable and scalable design allows for easy adaptation to large, multi-center cohort studies targeting various infant clinical populations where motor function monitoring is essential (i.e. from children with rare neurological disorders to all newborns).
Digital Twins for Monitoring Neuromotor Development in Preterm Infants: Conceptual Framework and Proof-of-concept Study. Journal of Medical Systems 49(1), 143. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12549732/pdf/
Pub Date: 2025-10-20. DOI: 10.1007/s10916-025-02274-0
Valerii A Zuev, Elena G Salmagambetova, Stepan N Djakov, Lev V Utkin
Video-electroencephalography (vEEG) monitoring is currently the reference standard in the diagnosis of epilepsy. Manual analysis of vEEG recordings is time-consuming and inter-rater agreement is low even when the annotation is done by experienced doctors; therefore, there is a need for automated, standardized methods for vEEG annotation. Recent advances in machine learning have shown promise in real-time epileptiform discharge detection, as well as seizure detection and prediction using EEG and video data. However, the diversity of seizure symptoms, markup ambiguities, and the limited availability of multimodal datasets hinder progress. This paper reviews the latest developments in automated video-EEG analysis and discusses the integration of multimodal data, focusing on research published in 2024 and the beginning of 2025. We also propose a novel pipeline for explainable treatment effect estimation from vEEG data using concept-based learning, offering a pathway for future research in this domain.
Automated Video-EEG Analysis in Epilepsy Studies: A Narrative Review of Advances and Challenges. Journal of Medical Systems 49(1), 142.
Pub Date: 2025-10-18. DOI: 10.1007/s10916-025-02284-y
Lei Xu, Wenzhe Zhao, Xin Huang
General-purpose large language models (LLMs) are increasingly proposed for diagnostic and triage decision support, yet their reliability relative to humans remains unclear. We evaluated eight contemporary LLMs (ChatGPT-4, ChatGPT-o1, DeepSeek-V3, DeepSeek-R1, Gemini-2.0, Copilot, Grok-2, Llama-3.1) on 48 single-turn clinical vignettes spanning four triage levels (Emergent, 1-day, 1-week, Self-care). Models were tested without prompts and with structured prompts comprising exemplar cases. Primary outcomes were diagnostic and triage accuracy. Secondary measures included confusion matrices, over-triage, safety of advice, and the Capability Comparison Score (CCS). Structured prompting improved performance across models: mean diagnostic accuracy increased from 89.84% to 91.67%, and mean triage accuracy increased from 76.82% to 86.20%. The best diagnostic accuracy was 93.75% (ChatGPT-o1 and DeepSeek-R1; Grok-2 matched this when prompted). Prompting shifted models toward safety: safety of advice rose from 89.06% to 94.53%, accompanied by higher over-triage (from 53.15% to 65.62%). CCS values were numerically lower than accuracy but preserved rankings and conclusions (diagnosis CCS: from 49.54 to 50.46; triage CCS: from 47.66 to 52.34). Error analyses showed predominant over-triage, with rarer but clinically important under-triage. On concise, text-only vignettes, the diagnostic accuracy of advanced LLMs was high, in some cases nearing benchmarks set by physicians in prior studies, whereas triage remained a more significant challenge. Structured prompting provided a practical, training-free lever to enhance robustness. Future work should evaluate uncertainty-aware prompting and real-world, multi-turn/multi-modality cases to strengthen clinical reliability.
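The triage outcomes named above can be illustrated with a minimal sketch. It assumes that over-triage means assigning a more urgent level than the reference, and it reports over- and under-triage as shares of all errors; the abstract does not state the exact denominators, so that choice is an assumption. The function name `triage_metrics` and the toy data are hypothetical and not taken from the study.

```python
from collections import Counter

# Triage levels ordered from most to least urgent, as listed in the abstract.
LEVELS = ["Emergent", "1-day", "1-week", "Self-care"]
RANK = {level: i for i, level in enumerate(LEVELS)}


def triage_metrics(reference, predicted):
    """Return accuracy plus over- and under-triage shares among errors.

    Assumption: a prediction more urgent than the reference is over-triage,
    a less urgent one is under-triage.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty lists of equal length")
    counts = Counter()
    for ref, pred in zip(reference, predicted):
        if RANK[pred] == RANK[ref]:
            counts["correct"] += 1
        elif RANK[pred] < RANK[ref]:
            counts["over"] += 1   # more urgent than the reference level
        else:
            counts["under"] += 1  # less urgent than the reference level
    errors = counts["over"] + counts["under"]
    return {
        "accuracy": counts["correct"] / len(reference),
        "over_triage_share_of_errors": counts["over"] / errors if errors else 0.0,
        "under_triage_share_of_errors": counts["under"] / errors if errors else 0.0,
    }


# Hypothetical toy vignettes, one per triage level.
reference = ["Emergent", "1-day", "1-week", "Self-care"]
predicted = ["Emergent", "Emergent", "1-week", "1-week"]
metrics = triage_metrics(reference, predicted)
print(metrics)
```

Keeping the levels in an ordered list makes the direction of each error explicit, which is what separates a safe-but-costly over-triage from a clinically riskier under-triage.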
Diagnosis and Triage Performance of Contemporary Large Language Models on Short Clinical Vignettes. Journal of Medical Systems 49(1), 141. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12535515/pdf/
Pub Date: 2025-10-16. DOI: 10.1007/s10916-025-02273-1
Antonio Balordi, Alice Bernasconi, Alessandra Andreotti, Stefano Guzzinati, Rafael Cabañas De Paz, Alessio Zanga
Cancer treatments can have several long-term effects. In this work we investigate the causal role of ovarian suppression therapy in ischemic heart disease and its potential precursors (i.e. hypertension and dyslipidemia) in adolescent and young adult (AYA) breast cancer (BC) survivors. Additionally, we assess the external validity of our findings through comparative analysis of regional data. We use a causal network model that leverages observational data on 1-year AYA BC survivors living in the Lombardy region of Italy. Using a structural causal model (SCM) and counterfactual analysis within Pearl's causal inference framework, we estimate the Average Causal Effect (ACE), Probability of Necessity (PN), and Probability of Sufficiency (PS) for the cause-effect relationships. Data from a regional cohort of AYA BC patients living in the Veneto region were used to externally validate the results. Ovarian suppression was found to be a necessary but not sufficient cause of ischemic heart disease (PN > 97.8%; PS < 1.97%). While PN was high for both hypertension and dyslipidemia, PS varied, suggesting that ovarian suppression alone could induce hypertension in about 30% of cases but was rarely sufficient for dyslipidemia onset. External validation confirmed the robustness of the findings across regions. Our results may be of interest to clinicians who aim to personalize the follow-up of AYA BC survivors, with particular attention to monitoring or preventing the onset of hypertension. The study demonstrates the value of counterfactual reasoning and causal inference when working with real-world data.
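For readers less familiar with the three quantities, their standard definitions in Pearl's framework can be written as follows. The binary coding is an assumption made here for illustration: $X=1$ denotes ovarian suppression, $Y=1$ denotes the outcome (for example ischemic heart disease), and $Y_{X=x}$ is the counterfactual outcome under treatment $x$. The abstract does not give the authors' own notation.

```latex
\begin{align}
\mathrm{ACE} &= P\bigl(Y=1 \mid do(X=1)\bigr) - P\bigl(Y=1 \mid do(X=0)\bigr) \\
\mathrm{PN}  &= P\bigl(Y_{X=0}=0 \mid X=1,\ Y=1\bigr) \\
\mathrm{PS}  &= P\bigl(Y_{X=1}=1 \mid X=0,\ Y=0\bigr)
\end{align}
```

Read this way, a high PN with a low PS says that among treated survivors who developed the outcome, most would not have developed it without treatment, while among untreated survivors without the outcome, treatment alone would rarely have produced it.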
On Counterfactual Explanations of Cardiovascular Risk in Adolescent and Young Adult Breast Cancer Survivors. Journal of Medical Systems 49(1), 140. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12532745/pdf/
Pub Date: 2025-10-14. DOI: 10.1007/s10916-025-02278-w
Neda Aminnejad, Emmalin Buajitti, Laura C Rosella, Huaxiong Huang
This study presents a machine learning-driven model predicting all-cause mortality two years in advance using administrative health data focused on diabetic patients. Integrating hospitalization records, emergency department data, demographics, and chronic disease information for 1553 variables, the study utilizes XGBoost, achieving an AUC of 0.89, which comparatively surpasses existing models. The research emphasizes the machine learning model's efficacy in capturing intricate mortality risk relationships and highlighting risk factors. While prior models often relied on specific cohorts or limited variables, this model, based on commonly available variables in primary care data, displays robust discrimination and calibration. Additionally, it highlights significant predictors such as age, immigration status, diagnosis age of comorbidities, number of comorbidities, and durations of comorbidities, aiding in early risk identification. The study suggests a potential for enhanced patient management and resource allocation based on mortality risk predictions for diabetic populations, showcasing the impact of machine learning in healthcare.
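The abstract reports calibration alongside discrimination but does not say how calibration was measured. As a hedged illustration only, the sketch below computes two common summaries, the Brier score and calibration-in-the-large (mean predicted risk minus observed event rate). Both function names and the toy data are hypothetical and unrelated to the study's cohort.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("need two non-empty lists of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate; 0 means no overall bias."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("need two non-empty lists of equal length")
    n = len(y_true)
    return sum(y_prob) / n - sum(y_true) / n


# Hypothetical outcomes (1 = died within two years) and predicted risks.
y_true = [1, 0, 1, 0]
y_prob = [0.8, 0.2, 0.6, 0.4]
print(brier_score(y_true, y_prob))
print(calibration_in_the_large(y_true, y_prob))
```

A model can rank patients well (high AUC) and still over- or under-state absolute risk, which is why the two properties are usually reported separately.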
Predicting All-Cause Mortality in Diabetic Patients 2 Years in Advance Using Aggregated EHR Data and Machine Learning. Journal of Medical Systems 49(1), 139.
Pub Date: 2025-10-13. DOI: 10.1007/s10916-025-02253-5
Edo Septian, Muhammad Rizal Khaefi, Achmad Athoillah, Dewi Nur Aisyah, Muhammad Hardhantyo, Fauziah Mauly Rahman, Logan Manikam
This study aims to enhance individual hypertension risk prediction in Indonesia using machine learning (ML) models. The research investigates the predictive accuracy of models with and without incorporating personal hypertension history, seeking to understand how data limitations impact model performance in a low-resource setting. Data from the SATUSEHAT IndonesiaKu (ASIK) system were preprocessed and filtered to create a dataset of 9.58 million adult health records. Two primary model variations were compared: Model A (incorporating patient history) and Model B (excluding patient history). We evaluated the models using five algorithms: XGBoost, LightGBM, CatBoost, Logistic Regression, and Random Forest. Model performance was assessed using the Area Under the Curve (AUC), sensitivity, and specificity metrics. Model A achieved superior predictive accuracy (AUC = 0.85) compared to Model B (AUC = 0.78). To mitigate potential bias, Model B was selected for further in-depth development. Evaluation of Model B showed that the XGBoost and LightGBM algorithms achieved the highest performance (AUC = 0.78), and LightGBM emerged as the best algorithm based on its performance. SHAP analysis identified key predictors such as age, family history of hypertension, body weight, and waist circumference. This study finds that while a patient's personal history of hypertension significantly enhances predictive accuracy, robust ML models can effectively predict hypertension risk using other accessible demographic, clinical, and lifestyle features. Model B offers a valuable and generalizable approach for broader risk screening, particularly where patient history may be unavailable or unreliable, while also providing insights into key modifiable and non-modifiable determinants of hypertension.
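The three evaluation metrics named above can be computed from labels and predicted scores as in the following minimal sketch. It uses plain Python and the rank-based (Mann-Whitney) formulation of AUC; the 0.5 decision threshold, the function names, and the toy data are assumptions for illustration and do not come from the study.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are flagged positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        flagged = score >= threshold
        if label == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, y_score):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = hypertensive) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
sens, spec = sensitivity_specificity(y_true, y_score)
print(sens, spec, auc(y_true, y_score))
```

AUC is threshold-free, while sensitivity and specificity depend on the chosen cut-off, so a screening programme would normally pick the threshold to match its tolerance for missed cases.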
Prediction of Personalised Hypertension Using Machine Learning in Indonesian Population. Journal of Medical Systems 49(1), 137. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12515743/pdf/
Pub Date: 2025-10-13. DOI: 10.1007/s10916-025-02257-1
David B Wax, Muoi A Trinh
Anesthesia Point-of-Care Nitrous Oxide Cylinder Leakage and a Proposed Engineering Control Solution. Journal of Medical Systems 49(1), 138.
This study aimed to develop machine learning-based algorithms to assist physicians in ultrasound-guided localization of the cricoid cartilage (CC), thyroid cartilage (TC), and cricothyroid membrane (CTM) for cricothyroidotomy. Adult female participants presenting to the emergency department with dyspnea or to the obstetrics and gynecology department for a scheduled cesarean section between August 2022 and July 2024 were prospectively recruited. Ultrasonographic images were collected using a wireless handheld ultrasound device connected to an edge computing tablet. Three You Only Look Once (YOLO) model variants (v5n6, v8n, and v10n) were selected for development and evaluation. A total of 608 participants (median age: 58.0 years, interquartile range [IQR]: 40.0-73.0; median body mass index: 23.2 kg/m², IQR: 20.2-26.5) contributed 117,094 ultrasonographic frames. All three YOLO-based models demonstrated high accuracy in detecting CC, TC, and CTM, with area under the receiver operating characteristic curve values exceeding 0.88. In correctly identified frames, the models effectively localized CC (IoU values: YOLOv5n6, 0.713 [95% confidence interval (CI): 0.698-0.726]; YOLOv8n, 0.718 [95% CI: 0.702-0.733]; YOLOv10n, 0.718 [95% CI: 0.701-0.734]; p value: 0.03) and TC (YOLOv5n6, 0.700 [95% CI: 0.683-0.717]; YOLOv8n, 0.706 [95% CI: 0.687-0.725]; YOLOv10n, 0.703 [95% CI: 0.783-0.721]; p value: 0.037), though localization accuracy was lower for CTM (YOLOv5n6, 0.364 [95% CI: 0.333-0.394]; YOLOv8n, 0.363 [95% CI: 0.331-0.394]; YOLOv10n, 0.354 [95% CI: 0.325-0.381]; p value: 0.053). The mean frames per second for YOLOv5n6, YOLOv8n, and YOLOv10n were 3.67, 13.83, and 14.13, respectively, when deployed on the handheld ultrasound platform. YOLO-based models demonstrated high accuracy in detecting and localizing CC, TC, and CTM. 
YOLOv8n and YOLOv10n achieved clinically acceptable real-time imaging performance when deployed on a wireless handheld ultrasound device with an edge computing tablet. Further studies are needed to assess whether this favorable performance translates into actual clinical benefits.
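The localization results above are reported as intersection over union (IoU) between a predicted and a reference bounding box. As a rough illustration of how that metric is usually computed, here is a minimal sketch. It assumes axis-aligned boxes given as corner coordinates; the `Box` type and the function names are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    width = max(0.0, box.x_max - box.x_min)
    height = max(0.0, box.y_max - box.y_min)
    return width * height


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    # The overlap rectangle is bounded by the innermost edges of both boxes.
    overlap = Box(
        max(pred.x_min, truth.x_min),
        max(pred.y_min, truth.y_min),
        min(pred.x_max, truth.x_max),
        min(pred.y_max, truth.y_max),
    )
    intersection = area(overlap)
    union = area(pred) + area(truth) - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes offset by one pixel share a 1x1 overlap: IoU = 1 / 7.
example = iou(Box(0, 0, 2, 2), Box(1, 1, 3, 3))
```

Read this way, an IoU near 0.7 (as reported for CC and TC) means the predicted and reference boxes share most of their combined area, while a value near 0.36 (as reported for CTM) indicates a much looser overlap.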
{"title":"Real-Time Identification of Cricothyrotomy Landmarks in Emergency Care and Obstetric Patients Using Wireless Handheld Ultrasound and Edge-Computing Artificial Intelligence: A Prospective Observational Study.","authors":"Cheng-Yi Wu, Jia-Da Li, Po-Yuan Shih, Cheng-Chia Huang, Hsiao-Liang Cheng, Chun-Yu Wu, Joyce Tay, Meng-Che Wu, Chih-Hung Wang, Chu-Song Chen, Chien-Hua Huang","doi":"10.1007/s10916-025-02275-z","DOIUrl":"10.1007/s10916-025-02275-z","url":null,"abstract":"<p><p>This study aimed to develop machine learning-based algorithms to assist physicians in ultrasound-guided localization of the cricoid cartilage (CC), thyroid cartilage (TC), and cricothyroid membrane (CTM) for cricothyroidotomy. Adult female participants presenting to the emergency department with dyspnea or to the obstetrics and gynecology department for a scheduled cesarean section between August 2022 and July 2024 were prospectively recruited. Ultrasonographic images were collected using a wireless handheld ultrasound device connected to an edge computing tablet. Three You Only Look Once (YOLO) model variants-v5n6, v8n, and v10n-were selected for development and evaluation. A total of 608 participants (median age: 58.0 years, interquartile range [IQR]: 40.0-73.0; median body mass index: 23.2 kg/m², IQR: 20.2-26.5) contributed 117,094 ultrasonographic frames. All three YOLO-based models demonstrated high accuracy in detecting CC, TC, and CTM, with area under the receiver operating characteristic curve values exceeding 0.88. 
In correctly identified frames, the models effectively localized CC (IOU values: YOLOv5n6, 0.713 [95% confidence interval (CI): 0.698-0.726]; YOLOv8n, 0.718 [95% CI: 0.702-0.733]; YOLOv10n, 0.718 [95% CI: 0.701-0.734]; p value: 0.03) and TC (YOLOv5n6, 0.700 [95% CI: 0.683-0.717]; YOLOv8n, 0.706 [95% CI: 0.687-0.725]; YOLOv10n, 0.703 [95% CI: 0.783-0.721] ; p value: 0.037), though localization accuracy was lower for CTM (YOLOv5n6, 0.364 [95% CI: 0.333-0.394]; YOLOv8n, 0.363 [95% CI: 0.331-0.394]; YOLOv10n, 0.354 [95% CI: 0.325-0.381] ; p value: 0.053). The mean frames per second for YOLOv5n6, YOLOv8n, and YOLOv10n were 3.67, 13.83, and 14.13, respectively, when deployed on the handheld ultrasound platform. YOLO-based models demonstrated high accuracy in detecting and localizing CC, TC, and CTM. YOLOv8n and YOLOv10n achieved clinically acceptable real-time imaging performance when deployed on a wireless handheld ultrasound device with an edge computing tablet. Further studies are needed to assess whether this favorable performance translates into actual clinical benefits.</p>","PeriodicalId":16338,"journal":{"name":"Journal of Medical Systems","volume":"49 1","pages":"131"},"PeriodicalIF":5.7,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145258353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-10 DOI: 10.1007/s10916-025-02262-4
Emre Sahin, Eda Akman Aydin
Parkinson's disease (PD) is a prevalent and complex neurodegenerative disorder, with early diagnosis playing a critical role in timely treatment and management. Handwriting dynamics has emerged as a promising biomarker for early detection of PD, yet current diagnostic methods often lack precision and robustness. This study introduces a novel multimodal deep learning-based decision support system to enhance PD diagnosis. Our approach leverages static and dynamic features of handwriting data by combining images of handwritten drawings with fused time-frequency representations of grip pressure, axial pressure, tilt, and accelerometer signals from the y- and z-axes recorded during handwriting. The time-frequency transformations employ Short-Time Fourier Transform (STFT) and Continuous Wavelet Transform (CWT) to generate spectrograms and scalograms. Results demonstrate that fusing STFT spectrograms achieves an accuracy of 85.41%, which improves to 97.92% when integrated into the multimodal CNN model. Similarly, fusing CWT scalograms achieves 92.08% accuracy, further enhanced to 96.66% with the multimodal approach. These findings highlight that fused time-frequency representations yield successful results for PD diagnosis. Furthermore, the CWT-based approach demonstrates superior performance compared to STFT. Finally, integrating fused time-frequency images with visualizations further improves the accuracy rates. We incorporate the Gradient-weighted Class Activation Mapping++ (Grad-CAM++) eXplainable Artificial Intelligence (XAI) method to ensure interpretability, highlighting attention regions within the fused STFT and CWT images. These attention regions effectively differentiate between healthy controls (HC) and PD patients. Although the model achieved promising results on the NewHandPD dataset, further external validation on diverse and multi-center datasets is required to confirm its generalizability and clinical applicability. The findings underscore the potential of integrating handwriting-based static and dynamic features for high-precision PD diagnosis, offering a robust and explainable framework for clinical decision-making.
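To make the STFT step concrete, the sketch below turns a one-dimensional signal into a magnitude spectrogram using a Hann window and a plain discrete Fourier transform. It is an illustration only: the abstract does not state the window length, hop size, or sampling rate, so `frame_len`, `hop`, and the synthetic sine input are all assumptions, and the function names are hypothetical. A real pipeline would more likely rely on an FFT library than on this direct DFT.

```python
import cmath
import math
from typing import List


def hann(n: int) -> List[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal: List[float], frame_len: int = 64, hop: int = 32) -> List[List[float]]:
    """Magnitude spectrogram: one row per frame, one column per frequency bin.

    frame_len and hop are illustrative defaults, not values from the study.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies of a real signal
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Taper the frame so its edges do not leak energy across bins.
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at bin k.
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(coeff))
        spectrogram.append(row)
    return spectrogram


# Synthetic stand-in for one sensor channel: a sine with 8 cycles per frame.
samples = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(samples)
peak_bins = [row.index(max(row)) for row in spec]  # dominant bin in each frame
```

Each row of `spec` is one time slice of the spectrogram. Stacking such images from several sensor channels is one plausible reading of the "fused" time-frequency representation, although the abstract does not describe how the fusion is actually performed.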
{"title":"A Multimodal Convolutional Neural Network Model for Parkinson's Disease Diagnosis Based on Fused Handwriting Dynamics Signals.","authors":"Emre Sahin, Eda Akman Aydin","doi":"10.1007/s10916-025-02262-4","DOIUrl":"https://doi.org/10.1007/s10916-025-02262-4","url":null,"abstract":"<p><p>Parkinson's disease (PD) is a prevalent and complex neurodegenerative disorder, with early diagnosis playing a critical role in timely treatment and management. Handwriting dynamics has emerged as a promising biomarker for early detection of PD, yet current diagnostic methods often lack precision and robustness. This study introduces a novel multimodal deep learning-based decision support system to enhance PD diagnosis. Our approach leverages static and dynamic features of handwriting data by combining images of handwritten drawings with fused time-frequency representations of grip pressure, axial pressure, tilt, and accelerometer signals from the y- and z-axes recorded during handwriting. The time-frequency transformations employ Short-Time Fourier Transform (STFT) and Continuous Wavelet Transform (CWT) to generate spectrograms and scalograms. Results demonstrate that fusing STFT spectrograms achieves an accuracy of 85.41%, which improves to 97.92% when integrated into the multimodal CNN model. Similarly, fusing CWT scalograms achieves 92.08% accuracy, further enhanced to 96.66% with the multimodal approach. These findings highlight that fused time-frequency representations yield successful results for PD diagnosis. Furthermore, the CWT-based approach demonstrates superior performance compared to STFT. Finally, integrating fused time-frequency images with visualizations further improves the accuracy rates. We incorporate the Gradient-weighted Class Activation Mapping++(Grad-CAM++) eXplainable Artificial Intelligence (XAI) method to ensure interpretability, highlighting attention regions within the fused STFT and CWT images. 
These attention regions effectively differentiate between healthy controls (HC) and PD patients. Although the model achieved promising results on the NewHandPD dataset, further external validation on diverse and multi-center datasets is required to confirm its generalizability and clinical applicability. The findings underscore the potential of integrating handwriting-based static and dynamic features for high-precision PD diagnosis, offering a robust and explainable framework for clinical decision-making.</p>","PeriodicalId":16338,"journal":{"name":"Journal of Medical Systems","volume":"49 1","pages":"132"},"PeriodicalIF":5.7,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145258288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}