Stacking model framework reveals clinical biochemical data and dietary behavior features associated with type 2 diabetes: A retrospective cohort study.
Yong Fu, Xinghuan Liang, Xi Yang, Li Li, Liheng Meng, Yuekun Wei, Daizheng Huang, Yingfen Qin
APL Bioengineering, vol. 8, no. 4, article 046111. Published 2024-11-21. DOI: https://doi.org/10.1063/5.0207658. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11584240/pdf/
Abstract
Background: Type 2 diabetes mellitus (T2DM) is the most common type of diabetes, accounting for around 90% of all diabetes cases. Studies have found that dietary habits and biochemical metabolic changes are closely related to T2DM, but existing early surveillance tools lack specificity and have low accuracy. This paper aimed to provide a reliable, high-accuracy artificial intelligence model for the clinical diagnosis of T2DM.
Methods: A cross-sectional dataset comprising 8981 individuals from the First Affiliated Hospital of Guangxi Medical University was analyzed with a model fusion framework. The framework combines four machine learning (ML) models using the stacking method. By leveraging the strengths of different algorithms to capture complex patterns in the data, it combines questionnaire data and blood test data to predict diabetes.
Results: The stacking model achieved strong predictive performance in diabetes detection. Compared with single machine learning algorithms, it improved accuracy, recall, and F1-score. On the test set, accuracy was 0.90, and precision, recall, F1-score, area under the curve (AUC), and average precision (AP) were 0.91, 0.90, 0.90, 0.90, and 0.85, respectively. Additionally, HbA1c (P < 0.001, OR = 2.203), fasting blood glucose (FBG) (P < 0.001, OR = 1.586), Ph2BG (P < 0.001, OR = 1.190), age (P < 0.001, OR = 1.018), Han nationality (P < 0.001, OR = 1.484), and carbonated beverages (P = 0.001, OR = 1.347) were important predictors of T2DM.
Conclusion: Stacking models show great potential in diabetes detection. By integrating multiple machine learning algorithms, they can significantly improve the accuracy and stability of diabetes prediction and provide strong support for disease prevention, early diagnosis, and individualized treatment.
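The stacking approach described in the abstract can be illustrated with a minimal scikit-learn sketch. The abstract does not name the four ML models, the meta-learner, or the feature set, so everything below is an assumption made for illustration: the four base learners, the logistic-regression meta-learner, and the synthetic data standing in for the questionnaire and blood test features. It is not the authors' implementation and does not reproduce their results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): the real study used questionnaire
# and blood test features from 8981 individuals, which are not available here.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Four hypothetical base learners; the abstract does not say which were used.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7))),
    ("nb", GaussianNB()),
]

# The meta-learner (assumed: logistic regression) is trained on the
# out-of-fold predictions of the base learners, produced by 5-fold CV.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_score = stack.predict_proba(X_test)[:, 1]  # probability of the positive class

# The same six metrics the abstract reports for the test set.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, y_score),
    "average_precision": average_precision_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is what keeps a stacked model from simply learning to trust its most overfitted component.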
About the journal:
APL Bioengineering is devoted to research at the intersection of biology, physics, and engineering. The journal publishes high-impact manuscripts specific to the understanding and advancement of physics and engineering of biological systems. APL Bioengineering is the new home for the bioengineering and biomedical research communities.
APL Bioengineering publishes original research articles, reviews, and perspectives. Topical coverage includes:
-Biofabrication and Bioprinting
-Biomedical Materials, Sensors, and Imaging
-Engineered Living Systems
-Cell and Tissue Engineering
-Regenerative Medicine
-Molecular, Cell, and Tissue Biomechanics
-Systems Biology and Computational Biology