Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches.

IF 3.9; Zone 2 (Medicine); JCR Q2, ENDOCRINOLOGY & METABOLISM; Frontiers in Endocrinology; Pub Date: 2024-11-11; eCollection Date: 2024-01-01; DOI: 10.3389/fendo.2024.1444282
Ahmet Kadir Arslan, Fatma Hilal Yagin, Abdulmohsen Algarni, Erol Karaaslan, Fahaid Al-Hashem, Luca Paolo Ardigò
Frontiers in Endocrinology, volume 15, article 1444282. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586166/pdf/
Citations: 0

Abstract

Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches.

Background: Type 2 diabetes mellitus (T2DM) is a global health problem characterized by insulin resistance and hyperglycemia. Early detection and accurate prediction of T2DM are crucial for effective management and prevention. This study explores the integration of machine learning (ML) and explainable artificial intelligence (XAI) approaches based on metabolomics panel data to identify biomarkers and develop predictive models for T2DM.

Methods: Metabolomics data from patients with T2DM (n = 31) and healthy controls (n = 34) were analyzed for biomarker discovery (mostly amino acids, fatty acids, and purines) and T2DM prediction. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression to improve the models' accuracy and interpretability. Three advanced tree-based ML algorithms (KTBoost: Kernel-Tree Boosting; XGBoost: eXtreme Gradient Boosting; NGBoost: Natural Gradient Boosting) were used to predict T2DM from these biomarkers. The SHapley Additive exPlanations (SHAP) method was used to explain the effects of the metabolomics biomarkers on the model's predictions.
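
To illustrate how LASSO can act as a feature selector, here is a minimal sketch in plain Python. It is not the authors' implementation, which the abstract does not describe; the data are a hypothetical toy set, and the function names and the penalty value `lam=0.5` are assumptions chosen for illustration. The L1 penalty shrinks each coefficient toward zero, and weak features end up with a coefficient of exactly zero, which removes them from the model.

```python
def soft_threshold(r: float, lam: float) -> float:
    """Shrink r toward zero by lam; values within [-lam, lam] become exactly 0."""
    if r > lam:
        return r - lam
    if r < -lam:
        return r + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes every feature column has at least one non-zero entry.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # feature j times the residual computed without feature j
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n)
    return b


# Hypothetical toy data (not from the study): two orthogonal features,
# with y = 2.0 * feature_0 + 0.1 * feature_1.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2.1, -1.9, 1.9, -2.1]

coef = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, c in enumerate(coef) if c != 0.0]
print("coefficients:", coef)
print("selected feature indices:", selected)
```

In this toy case the strong feature keeps a shrunken coefficient while the weak one is dropped. In a study like this one, the retained columns would be the metabolites passed on to the boosting models.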

Results: The study identified multiple metabolites associated with T2DM, and LASSO feature selection highlighted the most important biomarkers. KTBoost used the complex metabolomics data effectively for T2DM prediction and outperformed the other models [accuracy: 0.938, CI: (0.880-0.997); sensitivity: 0.971, CI: (0.847-0.999); area under the curve (AUC): 0.965, CI: (0.937-0.994)]. According to the SHAP analysis of KTBoost, high levels of phenyllactate (pla) and taurine metabolites, as well as low concentrations of cysteine, L-aspartate, and L-cysteate, are strongly associated with the presence of T2DM.
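
The abstract does not state the confidence level or the method behind the reported intervals. As a rough check, the sketch below computes a normal-approximation (Wald) interval for the accuracy, assuming a 95% level (z = 1.96) and assuming that the accuracy refers to all 65 participants, so that 0.938 corresponds to 61 correct classifications. Both assumptions are ours, and the function name is illustrative.

```python
import math


def wald_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion, clipped to [0, 1]."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


n_total = 31 + 34                  # T2DM patients + healthy controls (from the abstract)
correct = round(0.938 * n_total)   # assumed count of correct predictions
low, high = wald_interval(correct, n_total)
print(f"accuracy = {correct / n_total:.3f}, interval = ({low:.3f}, {high:.3f})")
```

Under these assumptions the interval rounds to (0.880, 0.997), which matches the reported accuracy interval. With only 65 samples the interval spans almost 12 percentage points, which supports the authors' call for validation in larger cohorts.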

Conclusion: The integration of metabolomics profiling and XAI offers a promising approach to predicting T2DM. The use of tree-based algorithms, in particular KTBoost, provides a robust framework for analyzing complex datasets and improves the prediction accuracy of T2DM onset. Future research should focus on validating these biomarkers and models in larger, more diverse populations to solidify their clinical utility.
