Pub Date: 2026-03-09 | eCollection Date: 2026-01-01 | DOI: 10.3389/fdgth.2026.1783347
Dao-Rong Hong, Chun-Yan Huang, Jiu Gao
Background: Large language models (LLMs) have shown growing potential for medical education and assessment, but evidence on their performance in specialty certification exams in China, particularly in ultrasound medicine, remains limited.
Objective: To compare the performance of ChatGPT-5 and DeepSeek on the Chinese Ultrasound Medicine Senior Professional Title Examination, overall and by item type.
Methods: Between August and September 2025, we randomly selected 100 multiple-choice questions from the official Chinese Ultrasound Medicine Senior Professional Title Examination bank (60 image-based interpretation items and 40 text-based items). We evaluated ChatGPT-5 and DeepSeek using identical prompts through their public web interfaces. The primary outcome was overall accuracy; secondary outcomes were accuracy by item type and subspecialty. Between-model differences were assessed using two-proportion z-tests (α = 0.05) in Python 3.12.
Results: Overall accuracy was higher for ChatGPT-5 than for DeepSeek [74.0% (74/100) vs. 60.0% (60/100); p = 0.035]. Accuracy on image-based items was also higher for ChatGPT-5 (61.7% vs. 40.0%; p = 0.018). Performance on text-based items was similar for both models (92.5% vs. 90.0%). Subspecialty patterns varied across domains; however, no between-model differences reached statistical significance.
Conclusions: ChatGPT-5 outperformed DeepSeek on image-based items (61.7% vs. 40.0%), while both models performed similarly on text-based knowledge items (92.5% vs. 90.0%). Overall, both LLMs showed strong performance on Chinese ultrasound senior-title examination questions, with complementary strengths across content areas. They may be useful as supplementary educational tools, but further advances in multimodal reasoning are needed to support more reliable image interpretation.
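The between-model comparisons reported above can be approximately reproduced from the published counts. The following is a minimal sketch, assuming a pooled two-proportion z-test with a two-sided p-value; the abstract does not say which Python library was used, so this version relies only on the standard library. The image-item counts (37/60 and 24/60) and text-item counts (37/40 and 36/40) are inferred from the reported percentages, and the function name `two_proportion_z_test` is illustrative.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Correct answers and item totals: ChatGPT-5 first, DeepSeek second.
comparisons = {
    "overall": (74, 100, 60, 100),
    "image-based (inferred counts)": (37, 60, 24, 60),
    "text-based (inferred counts)": (37, 40, 36, 40),
}

for label, counts in comparisons.items():
    z, p = two_proportion_z_test(*counts)
    print(f"{label}: z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the sketch gives p-values of about 0.035 overall and 0.018 for image-based items, in line with the reported values, and a clearly non-significant difference for text-based items.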
Title: Comparative performance of ChatGPT-5 and DeepSeek on the Chinese ultrasound medicine senior professional title examination. Journal: Frontiers in digital health, volume 8, article 1783347. Impact factor: 3.2. Published: 2026-03-09. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12968994/pdf/
Pub Date: 2026-03-06 | eCollection Date: 2026-01-01 | DOI: 10.3389/fdgth.2026.1761690
Belén Curto, Vidal Moreno, Juan-Alberto García-Esteban, David Sanchez-Poveda, Pablo Alonso, Felipe Zaballos
Introduction: Ultrasonography (US) plays a central role in modern diagnostic and interventional medicine, particularly in the management of facet-origin chronic low back pain, a highly prevalent condition in industrialized societies. However, its clinical effectiveness depends largely on the level of specialist training, requiring advanced skills in probe manipulation, sonoanatomy interpretation, brain-hand-eye coordination, and safe planning of interventional procedures. This work presents the development of a training simulator for ultrasound-guided treatment of lumbar facet syndrome; the simulator is implemented within a modular learning framework designed to support the flexible and efficient creation of procedure-specific simulators.
Methods: The developed simulator integrates a physical replica of an ultrasound probe, enabling trainees to practice realistic handling. Probe movements performed by the trainee along the scan path are continuously tracked and mapped to corresponding ultrasound images and videos, previously acquired by clinical experts from a real subject and displayed in real time on a computer screen. For interventional planning, a virtual syringe-and-needle component allows trainees to simulate needle orientation and insertion depth, with relevant anatomical structures highlighted as visual learning aids.
Results: A validation study was conducted involving 18 final-year medical students using an ad hoc questionnaire addressing usability, realism, learning support, and overall training experience. The results demonstrate a high level of student acceptance and a positive perceived impact on the acquisition of skills related to ultrasound-guided exploration and interventional planning. Most students reported accelerated skill acquisition in US examination (89% very satisfied, 11% satisfied) and high motivation (83% very satisfied, 17% satisfied). Overall performance and the likelihood of recommending the simulator received the highest rating from all participants (100%).
Discussion: From the perspective of students, the simulator provides a realistic and supportive learning experience, particularly due to the realism of the physical probe replica, the quality of the graphical user interface, and the guided learning process. From the perspective of instructors, the effectiveness of the simulator depends on the quality of the learning resources and the scope of the training cases. Although the preparation and curation of high-quality ultrasound datasets and annotations remains time-consuming, the framework significantly facilitates and adds flexibility to the development of new case studies. This positions the approach as a valuable complementary training resource, helping to bridge the gap between theoretical instruction and supervised clinical practice in ultrasound-guided procedures.
Title: A professional training simulator for skill acquisition in ultrasound-guided lumbar facet syndrome intervention: design and educational evaluation. Journal: Frontiers in digital health, volume 8, article 1761690. Impact factor: 3.2. Published: 2026-03-06. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13002623/pdf/
Pub Date: 2026-03-06 | eCollection Date: 2026-01-01 | DOI: 10.3389/fdgth.2026.1772965
Dao-Rong Hong, Chun-Yan Huang, Jiu Gao
Background: Ultrasonography training for residents is challenging owing to its operator-dependent nature and difficulties in mastering subtle image interpretation. Multimodal large language models like ChatGPT-4o enable efficient knowledge retrieval but show marked limitations in static ultrasonography image analysis.
Methods: In this prospective, single-centre randomized controlled trial, 45 first-year ultrasonography residents were randomly allocated to control (traditional resources), AI-only (ChatGPT-4o exclusively), or blended (ChatGPT-4o plus weekly faculty tutorials) groups. After a 3-week intervention, performance was assessed using a 150-item examination (pure-text and image-based multiple-choice questions). The study was approved by the institutional ethics committee, and written informed consent was obtained.
Results: The blended group achieved the highest scores (mean 128.40 ± 18.25) vs. AI-only (119.87 ± 19.11) and control (110.60 ± 20.45; P = 0.02), with superior pure-text performance (P = 0.03) and significant advantages in obstetrics/gynaecology (P = 0.04) and superficial organ ultrasonography (P = 0.047). Examination time was shortest in the blended group (P = 0.03). ChatGPT-4o alone was 85% accurate on text but only 47% on image-based questions.
Conclusions: A faculty-guided, AI-integrated strategy was associated with improved short-term post-intervention performance compared with AI-only or traditional learning; however, the effects reflect the combined intervention, and AI support for static ultrasound image interpretation remains limited.
Title: ChatGPT-4o with faculty guidance outperforms AI-only and traditional learning in ultrasonography training: a randomized trial. Journal: Frontiers in digital health, volume 8, article 1772965. Impact factor: 3.2. Published: 2026-03-06. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12969794/pdf/
Pub Date: 2026-03-05 | eCollection Date: 2026-01-01 | DOI: 10.3389/fdgth.2026.1697787
Massimo Micocci, Omar Butt, ShanShan Zhou, Austen El-Osta, Peter Buckle, George B Hanna
Background: Hypertension remains a major health burden in the UK, contributing significantly to cardiovascular disease and health inequalities. Although digital health technologies offer opportunities to enhance hypertension management, current NHS pathways face challenges, including inefficiencies in patient monitoring, limited patient engagement, and resource constraints. This study aimed to evaluate integration challenges of remote digital monitoring tools for blood pressure into NHS hypertension care pathways.
Methods: This exploratory study combined semi-structured interviews with 14 primary care NHS stakeholders recruited from across England and a field study at two GP practices. Participants were selected to include people both with and without experience of digital platforms for remote monitoring of chronic conditions in primary care. Clinical pathway mapping and gap analysis were used to identify inefficiencies in hypertension management and to explore the potential integration of digital platforms.
Results: Eight major gaps were identified, including inconsistent patient engagement, lack of automated identification of at-risk and non-compliant patients, limited access to home monitors, and health inequalities related to digital literacy. Integration of a digital platform addressed several of these gaps by promoting self-monitoring behaviours, improving resource allocation through risk stratification, and enhancing decision-making with continuous patient data. However, barriers such as interoperability issues, workload concerns, literacy disparities, and unclear role responsibilities were noted.
Conclusion: Successful implementation requires addressing systemic challenges through targeted training, robust interoperability standards, clearer task allocation, and equity-focused interventions to bridge the digital divide. A human-centred, system-wide strategy is essential to ensure sustainable adoption and maximise the impact of digital innovations in primary care.
Title: Integrating remote blood pressure monitoring into NHS primary care: a human factors perspective. Journal: Frontiers in digital health, volume 8, article 1697787. Impact factor: 3.2. Published: 2026-03-05. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999572/pdf/
Pub Date: 2026-03-05 | eCollection Date: 2026-01-01 | DOI: 10.3389/fdgth.2026.1778627
Ming Yu, Rong Yu, Mengjia Zhou, Xiaoli Fan, Ronghui Geng, Jing Ji, Suping Cai, Lili Jiang, Lingling Jiang
Aim: To assess clinical nurses' attitudes towards artificial intelligence (AI) in county hospitals and to analyze the factors that influence them, so as to provide a reference for promoting the application of AI technology in primary medical care.
Design: A descriptive, cross-sectional study.
Methods: From August to September 2025, a total of 449 clinical nurses from a county-level B-level hospital in Nantong City, China, were recruited by convenience sampling. Data were collected using a general information questionnaire, the Attitude Scale for the Application of Artificial Intelligence Technology in Nursing, the Artificial Intelligence Literacy Scale, and the Change Fatigue Scale.
Results: The total score of clinical nurses' attitudes toward AI was 45.17 ± 2.38, indicating a moderate level. Multiple linear regression analysis identified age, participation in AI-related training, education level, number of monthly night shifts, change fatigue, and total AI literacy score as significant determinants of AI attitudes (all P < 0.05). Collectively, these factors accounted for 60.6% of the total variance in AI attitude scores.
Conclusion: The attitude of Chinese county-level clinical nurses towards AI is at a moderate level and is influenced by multiple modifiable factors. To enhance AI acceptance and facilitate its integration into primary care, we recommend implementing targeted AI training programs, improving AI literacy, optimizing scheduling to reduce night shift burdens, and proactively managing change fatigue.
Title: A cross-sectional analysis of AI readiness and attitudes among nurses in resource-limited Chinese county hospitals. Journal: Frontiers in digital health, volume 8, article 1778627. Impact factor: 3.2. Published: 2026-03-05. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999563/pdf/
Pub Date: 2026-03-05 | eCollection Date: 2026-01-01 | DOI: 10.3389/fdgth.2026.1710829
Mahreen Kiran, Ying Xie, Graham Ball, Rudolph Schutte, Nasreen Anjum, Barbara Pierscionek
Introduction: Type 2 Diabetes Mellitus (T2DM) is a rising global health concern, heavily influenced by modifiable lifestyle and psychosocial factors. However, most predictive tools focus on biomedical markers and rely on real-time data from wearables or electronic health records, limiting their scalability in resource-constrained settings. This study presents a novel digital twin (DT) framework that uses retrospective lifestyle, behavioral, and psychosocial data to forecast T2DM onset and simulate the estimated effects of preventive interventions.
Methods: Data were drawn from 19,774 participants in the UK Biobank cohort, followed for up to 17 years. A penalized Cox proportional hazards model was employed to estimate individual time-to-event risk trajectories based on 90 candidate predictors. Predictors were selected through univariate screening, multicollinearity assessment, and variance filtering, yielding a final model with 14 significant variables. Causal inference techniques, including directed acyclic graphs (DAGs) and counterfactual simulations, were used to explore intervention effects on disease progression.
Results: The model demonstrated strong predictive performance (C-index = 0.90, SD = 0.004). Psychosocial stressors such as loneliness, insomnia, and poor mental health emerged as strong independent predictors and were associated with estimated increases in absolute T2DM risk of approximately 35 percentage points individually and nearly 78 percentage points when combined, under the modeled assumptions. These effects were partly reinforced through diet, with high intake of processed meat, salt, and sugary cereals acting as risk amplifiers within the modeled causal pathways. Cheese intake was protective overall, but its estimated benefit was attenuated under psychosocial stress, where reduced consumption produced a small, directionally harmful mediation effect. Counterfactual simulations suggested that improvements in psychosocial conditions could reduce estimated T2DM risk by approximately 11.6 percentage points within the modeled cohort, with protective dietary patterns such as cheese consumption re-emerging as psychosocial stress was alleviated. The model also revealed pronounced ethnic disparities, with South Asian, African, and Caribbean participants exhibiting significantly higher estimated risk than White counterparts within this cohort. These findings highlight the potential of integrated, stress-informed prevention strategies that address both psychosocial and dietary pathways.
Conclusion: This study introduces a transparent, simulation-enabled DT framework for estimating T2DM risk and exploring behavioral intervention scenarios without reliance on real-time data streams. It enables interpretable, personalized prevention planning and supports exploration of scalable deployment in public health, pa…
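The C-index reported in the Results measures how often a model assigns a higher risk score to the participant who develops the condition earlier. A minimal sketch of Harrell's concordance index follows; the follow-up times, event flags, and risk scores are hypothetical toy values, not study data, and the function name `concordance_index` is illustrative rather than the authors' implementation.

```python
from itertools import permutations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk score."""
    concordant = 0.0
    comparable = 0
    for i, j in permutations(range(len(times)), 2):
        # A pair is comparable when subject i had the event (events[i] == 1)
        # strictly before subject j's observed time.
        if events[i] == 1 and times[i] < times[j]:
            comparable += 1
            if risks[i] > risks[j]:
                concordant += 1.0
            elif risks[i] == risks[j]:
                concordant += 0.5  # tied risk scores count as half
    return concordant / comparable


# Hypothetical toy cohort: years of follow-up, event flag (1 = onset,
# 0 = censored), and a model risk score for each of five participants.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.3, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, risks)
print(f"C-index = {c_index:.2f}")  # 6 of 8 comparable pairs are concordant: 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.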
Title: A digital twin framework for predicting and simulating type 2 diabetes onset using retrospective lifestyle data. Journal: Frontiers in digital health, volume 8, article 1710829. Impact factor: 3.2. Published: 2026-03-05. Type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999582/pdf/
Pub Date: 2026-03-05; eCollection Date: 2026-01-01; DOI: 10.3389/fdgth.2026.1754061
Gábor Speer
{"title":"The repaired man or the man with extras: medical human-cyborgs.","authors":"Gábor Speer","doi":"10.3389/fdgth.2026.1754061","DOIUrl":"https://doi.org/10.3389/fdgth.2026.1754061","url":null,"abstract":"","PeriodicalId":73078,"journal":{"name":"Frontiers in digital health","volume":"8 ","pages":"1754061"},"PeriodicalIF":3.2,"publicationDate":"2026-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999961/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147500463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-05; eCollection Date: 2026-01-01; DOI: 10.3389/fdgth.2026.1733713
Agnes Kyamulabi, Abdul-Fatawu Abdulai
Background: The use of digital health technologies to access information and services related to sexual and reproductive health has been increasing. Despite the usefulness of these technologies, there are emerging concerns that they could inadvertently trigger, perpetuate and exacerbate trauma among patients. The purpose of this study was to explore trauma-informed care principles that could be applied in designing and/or utilizing sexual and reproductive health services.
Method: We conducted five focus group discussions with participants who had used digital health technologies to access sexual and reproductive health services in Western Canada. The discussions centred on ways in which sexual health-related digital technologies could avoid triggering or perpetuating trauma among patients. They were held over Zoom, and the data were analyzed using a thematic analysis approach.
Results: The study revealed five main considerations that could be adopted in the design and use of sexual and reproductive health technologies to prevent the unintended consequences of trauma. These include (1) integrating accessibility and inclusivity features; (2) integrating confidentiality, safety, and privacy features like quick exit buttons; (3) using empathetic language and terminologies; (4) integrating emotional and psychological support services; and (5) implementing aesthetic design features.
Conclusion: The findings of this study can help produce equitable, safe, and empowering digital health technologies for all users, particularly trauma survivors. By integrating these principles, developers and healthcare providers can create tools that reduce barriers, mitigate re-traumatization risks, and promote positive health outcomes. Future research should focus on evaluating the implementation and impact of trauma-informed digital tools in diverse settings.
{"title":"\"<i>I need to feel safe before I can engage</i>\": embedding trauma-informed principles in sexual and reproductive health digital technologies.","authors":"Agnes Kyamulabi, Abdul-Fatawu Abdulai","doi":"10.3389/fdgth.2026.1733713","DOIUrl":"https://doi.org/10.3389/fdgth.2026.1733713","url":null,"abstract":"<p><strong>Background: </strong>The use of digital health technologies to access information and services related to sexual and reproductive health has been increasing. Despite the usefulness of these technologies, there are emerging concerns that they could inadvertently trigger, perpetuate and exacerbate trauma among patients. The purpose of this study was to explore trauma-informed care principles that could be applied in designing and/or utilizing sexual and reproductive health services.</p><p><strong>Method: </strong>We conducted 5 focus group discussions with participants who have used digital health technologies to access sexual and reproductive health services in Western Canada. The discussion centred on ways sexual health-related digital technologies could prevent triggering or perpetuating trauma among patients. The discussion took place over Zoom, and the data were analyzed using a thematic analysis approach.</p><p><strong>Results: </strong>The study revealed five main considerations that could be adopted in the design and use of sexual and reproductive health technologies to prevent the unintended consequences of trauma. These include (1) integrating accessibility and inclusivity features; (2) integrating confidentiality, safety, and privacy features like quick exit buttons; (3) using empathetic language and terminologies; (4) integrating emotional and psychological support services; and (5) implementing aesthetic design features.</p><p><strong>Conclusion: </strong>The findings of this study would help produce equitable, safe, and empowering digital health technologies for all users, particularly trauma survivors. 
By integrating these principles, developers and healthcare providers can create tools that reduce barriers, mitigate re-traumatization risks, and promote positive health outcomes. Future research should focus on evaluating the implementation and impact of trauma-informed digital tools in diverse settings.</p>","PeriodicalId":73078,"journal":{"name":"Frontiers in digital health","volume":"8 ","pages":"1733713"},"PeriodicalIF":3.2,"publicationDate":"2026-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12999910/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147500926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Purpose: To evaluate three large language models (LLMs), namely ChatGPT 5, ChatGPT 4o, and ChatGPT 3.5, in automating TNM staging from PET-CT reports across six cancer types, and to assess their clinical utility compared with junior radiologists.
Materials and methods: PET-CT reports from 552 treatment-naive patients at two institutions with confirmed primary malignancies (lung, breast, liver, pancreatic, renal, and prostate cancer) were analyzed. Three ChatGPT-series LLMs and five junior radiologists independently performed TNM staging. Reference standards were established by two senior radiologists according to the 8th edition of the American Joint Committee on Cancer (AJCC) staging system. Performance was evaluated using accuracy rates. Intra-model agreement was assessed by running each model three times per report with identical prompts, and inter-model agreement was evaluated using Cohen's κ coefficients.
Results: ChatGPT 5 achieved the highest overall accuracy (82.1%, 453/552), followed by ChatGPT 4o (74.3%, 410/552), both significantly outperforming ChatGPT 3.5 (59.6%, 329/552). Junior radiologists reached 77.0% (425/552), below ChatGPT 5 (p = 0.041) but above ChatGPT 4o. Accuracy varied by cancer type, with the highest performance in lung cancer staging (88.5%) and the lowest in pancreatic cancer (69.2%). Across TNM categories, all models achieved the best performance in T staging, followed by N staging, with M staging remaining the most challenging. ChatGPT 5 showed near-perfect intra-model agreement (κ = 0.96), while inter-model agreement ranged from moderate between ChatGPT 3.5 and 4o (κ = 0.58) to substantial between ChatGPT 5 and 4o (κ = 0.78). ChatGPT 5 processed cases markedly faster than junior radiologists (8.3 ± 3.2 vs. 92.5 ± 21.7 s per case; p < 0.001).
Conclusion: Among the three LLMs, ChatGPT 5 demonstrated the highest accuracy, stability, and efficiency in automated TNM staging from PET-CT reports, achieving performance comparable to or slightly exceeding that of junior radiologists. Its advantages in T staging and lung cancer evaluation highlight its clinical utility as a potential decision-support tool.
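For readers unfamiliar with the agreement statistic named in the methods, the following is a minimal sketch of Cohen's κ for two sets of stage labels over the same reports. It is illustrative only: the function name `cohens_kappa` and the label lists are hypothetical, and the abstract does not describe the authors' software or how agreement across three repeated runs was summarized (Cohen's κ is defined for pairs of raters).

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up stage labels for six reports; not data from the study.
run_1 = ["I", "II", "II", "III", "IV", "I"]
run_2 = ["I", "II", "III", "III", "IV", "II"]
kappa = cohens_kappa(run_1, run_2)
```

Here the two label lists agree on four of six reports (observed agreement 0.667) against a chance agreement of 0.25, so κ is well below the raw agreement rate. That gap is why agreement is usually reported as κ rather than as percent agreement.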
{"title":"Evaluating large language models for automated TNM staging from PET-CT reports: a multi-cancer comparative study.","authors":"Wen Xu, Lixiu Cao, Qijun Shen, Yanna Shan, Shushu Pan, Mei Ruan","doi":"10.3389/fdgth.2026.1741973","DOIUrl":"https://doi.org/10.3389/fdgth.2026.1741973","url":null,"abstract":"<p><strong>Purpose: </strong>To evaluate three large language models (LLMs), including ChatGPT 5, ChatGPT 4o, and ChatGPT 3.5, in automating TNM staging from PET-CT reports across six cancer types, and to assess their clinical utility compared with junior radiologists.</p><p><strong>Materials and methods: </strong>PET-CT reports from 552 treatment-naive patients in two institutions with confirmed primary malignancies (lung, breast, liver, pancreatic, renal, and prostate cancer) were analyzed. Three ChatGPT-series LLMs and five junior radiologists independently performed TNM staging. Reference standards were established by two senior radiologists according to the 8th version of American Joint Committee on Cancer (AJCC) staging system. Performance was evaluated using accuracy rates. Intra-model agreement was assessed by repeating each model three times per report with identical prompts, and inter-model agreement was evaluated using Cohen's <i>κ</i> coefficients.</p><p><strong>Results: </strong>ChatGPT 5 achieved the highest overall accuracy (82.1%, 453/552), followed by ChatGPT 4o (74.3%, 410/552), both significantly outperforming ChatGPT 3.5 (59.6%, 329/552) and junior radiologists (77.0%, 425/552; <i>p</i> = 0.041 for ChatGPT 5 vs. junior radiologists). Accuracy varied by cancer type, with the highest performance in lung cancer staging (88.5%) and the lowest in pancreatic cancer (69.2%). Across TNM categories, all models achieved the best performance in T staging, followed by N staging, with M staging remaining the most challenging. 
ChatGPT 5 showed near-perfect intra-model agreement (<i>κ</i> = 0.96), while inter-model agreement ranged from moderate between ChatGPT 3.5 and 4o (<i>κ</i> = 0.58) to substantial between ChatGPT 5 and 4o (<i>κ</i> = 0.78). ChatGPT 5 processed cases markedly faster than junior radiologists (8.3 ± 3.2 vs. 92.5 ± 21.7 s per case; <i>p</i> < 0.001).</p><p><strong>Conclusion: </strong>Among the three LLMs, ChatGPT 5 demonstrated the highest accuracy, stability, and efficiency in automated TNM staging from PET-CT reports, achieving performance comparable to or slightly exceeding junior radiologists. Its advantages in T staging and lung cancer evaluation highlight its clinical utility as a potential decision-support tool.</p>","PeriodicalId":73078,"journal":{"name":"Frontiers in digital health","volume":"8 ","pages":"1741973"},"PeriodicalIF":3.2,"publicationDate":"2026-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12996206/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147488606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-03-04; eCollection Date: 2026-01-01; DOI: 10.3389/fdgth.2026.1806851
Daniel B Hier, Michael D Carrithers, Jorge M Rodríguez-Fernández, Benjamin Kummer
{"title":"Editorial: The digitalization of neurology-volume II.","authors":"Daniel B Hier, Michael D Carrithers, Jorge M Rodríguez-Fernández, Benjamin Kummer","doi":"10.3389/fdgth.2026.1806851","DOIUrl":"https://doi.org/10.3389/fdgth.2026.1806851","url":null,"abstract":"","PeriodicalId":73078,"journal":{"name":"Frontiers in digital health","volume":"8 ","pages":"1806851"},"PeriodicalIF":3.2,"publicationDate":"2026-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12996112/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147488688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}