A deep attention-based encoder for the prediction of type 2 diabetes longitudinal outcomes from routinely collected health care data

Impact factor: 7.5; Tier 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence; Expert Systems with Applications; Pub Date: 2025-02-20; DOI: 10.1016/j.eswa.2025.126876

Enrico Manzini, Bogdan Vlacho, Josep Franch-Nadal, Joan Escudero, Ana Génova, Elisenda Reixach, Erich Andrés, Israel Pizarro, Dídac Mauricio, Alexandre Perera-Lluna

Expert Systems with Applications, volume 274, Article 126876. Journal article, not open access. https://www.sciencedirect.com/science/article/pii/S0957417425004981


Abstract

Recent evidence indicates that Type 2 Diabetes Mellitus (T2DM) is a complex and highly heterogeneous disease involving various pathophysiological and genetic pathways, which presents clinicians with challenges in disease management. While deep learning models have made significant progress in helping practitioners manage T2DM treatments, several important limitations persist. In this paper we propose DARE, a model based on the transformer encoder, designed for analyzing longitudinal heterogeneous diabetes data. The model can be easily fine-tuned for various clinical prediction tasks, enabling a computational approach to assist clinicians in the management of the disease. We trained DARE using data from over 200,000 diabetic subjects from the primary healthcare SIDIAP database, which includes diagnosis and drug codes, along with various clinical and analytical measurements. After an unsupervised pre-training phase, we fine-tuned the model for predicting three specific clinical outcomes: i) occurrence of comorbidity, ii) achievement of target glycemic control (defined as glycated hemoglobin < 7%) and iii) changes in glucose-lowering treatment. In cross-validation, the embedding vectors generated by DARE outperformed those from baseline models (comorbidities prediction task AUC = 0.88, treatment prediction task AUC = 0.91, HbA1c target prediction task AUC = 0.82). Our findings suggest that attention-based encoders improve results with respect to different deep learning and classical baseline models when used to predict different clinically relevant outcomes from T2DM longitudinal data.
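The AUC values reported above measure how reliably a classifier ranks positive cases above negative ones. As a minimal sketch of that metric only, and not the authors' evaluation code, the following computes AUC with the Mann-Whitney pairwise formulation. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case,
    counting ties as one half (Mann-Whitney U statistic, normalized)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = HbA1c target (< 7%) reached, 0 = not reached.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
toy_auc = roc_auc(toy_labels, toy_scores)
```

In a cross-validation setting such as the one the abstract mentions, a value like this would typically be computed on each held-out fold and then averaged. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable at the scale of a cohort like SIDIAP.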


Source journal

Expert Systems with Applications (Engineering Technology - Engineering: Electrical and Electronic)

CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.