AI-driven Integration of Multimodal Imaging Pixel Data and Genome-wide Genotype Data Enhances Precision Health for Type 2 Diabetes: Insights from a Large-scale Biobank Study
{"title":"AI-driven Integration of Multimodal Imaging Pixel Data and Genome-wide Genotype Data Enhances Precision Health for Type 2 Diabetes: Insights from a Large-scale Biobank Study","authors":"Yi-Jia Huang, Chun-houh Chen, Hsin-Chou Yang","doi":"10.1101/2024.07.25.24310650","DOIUrl":null,"url":null,"abstract":"The rising prevalence of Type 2 Diabetes (T2D) presents a critical global health challenge. Effective risk assessment and prevention strategies not only improve patient quality of life but also alleviate national healthcare expenditures. The integration of medical imaging and genetic data from extensive biobanks, driven by artificial intelligence (AI), is revolutionizing precision and smart health initiatives.\nIn this study, we applied these principles to T2D by analyzing medical images (abdominal ultrasonography and bone density scans) alongside whole-genome single nucleotide variations in 17,785 Han Chinese participants from the Taiwan Biobank. Rigorous data cleaning and preprocessing procedures were applied. Imaging analysis utilized densely connected convolutional neural networks, augmented by graph neural networks to account for intra-individual image dependencies, while genetic analysis employed Bayesian statistical learning to derive polygenic risk scores (PRS). These modalities were integrated through eXtreme Gradient Boosting (XGBoost), yielding several key findings.\nFirst, pixel-based image analysis outperformed feature-centric image analysis in accuracy, automation, and cost efficiency. Second, multi-modality analysis significantly enhanced predictive accuracy compared to single-modality approaches. Third, this comprehensive approach, combining medical imaging, genetic, and demographic data, represents a promising frontier for fusion modeling, integrating AI and statistical learning techniques in disease risk assessment. Our model achieved an Area under the Receiver Operating Characteristic Curve (AUC) of 0.944, with an accuracy of 0.875, sensitivity of 0.882, specificity of 0.875, and a Youden index of 0.754. Additionally, the analysis revealed significant positive correlations between the multi-image risk score (MRS) and T2D, as well as between the PRS and T2D, identifying high-risk subgroups within the cohort.\nThis study pioneers the integration of multimodal imaging pixels and genome-wide genetic variation data for precise T2D risk assessment, advancing the understanding of precision and smart health.","PeriodicalId":501454,"journal":{"name":"medRxiv - Health Informatics","volume":"7 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"medRxiv - Health Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1101/2024.07.25.24310650","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The rising prevalence of type 2 diabetes (T2D) presents a critical global health challenge. Effective risk assessment and prevention strategies not only improve patients' quality of life but also reduce national healthcare expenditures. The integration of medical imaging and genetic data from large-scale biobanks, driven by artificial intelligence (AI), is revolutionizing precision and smart health initiatives.
In this study, we applied these principles to T2D by analyzing medical images (abdominal ultrasonography and bone density scans) alongside whole-genome single-nucleotide variations in 17,785 Han Chinese participants from the Taiwan Biobank. Rigorous data cleaning and preprocessing procedures were applied. Imaging analysis utilized densely connected convolutional neural networks (DenseNets), augmented by graph neural networks to account for intra-individual image dependencies, while genetic analysis employed Bayesian statistical learning to derive polygenic risk scores (PRS). These modalities were integrated through eXtreme Gradient Boosting (XGBoost), yielding several key findings.
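To illustrate the final fusion step described above, the sketch below combines a per-participant image-derived risk score, a PRS, and demographic covariates with XGBoost into a single T2D prediction. This is a minimal sketch under stated assumptions, not the authors' pipeline: the feature names, hyperparameters, and synthetic data are all illustrative.

```python
# Minimal sketch of the late-fusion step only. In the study, the image risk
# score comes from the DenseNet/GNN imaging model and the PRS from Bayesian
# statistical learning; here both are simulated so the example runs end to end.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000

mrs = rng.normal(size=n)            # multi-image risk score (hypothetical)
prs = rng.normal(size=n)            # polygenic risk score (hypothetical)
age = rng.uniform(30, 70, size=n)   # demographic covariate (hypothetical)
sex = rng.integers(0, 2, size=n)    # demographic covariate (hypothetical)

# Synthetic labels: T2D status drawn from a logistic model of the features.
logit = 1.2 * mrs + 0.8 * prs + 0.03 * (age - 50) + 0.2 * sex
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([mrs, prs, age, sex])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# XGBoost fuses the modalities into one risk prediction; hyperparameters
# below are placeholders, not the tuned values from the paper.
model = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1,
                      eval_metric="logloss")
model.fit(X_tr, y_tr)
print("held-out AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```

A late-fusion design like this keeps each modality's model independent, so the imaging and genetic components can be retrained or swapped without touching the fusion layer.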
First, pixel-based image analysis outperformed feature-centric image analysis in accuracy, automation, and cost efficiency. Second, multi-modality analysis significantly improved predictive accuracy compared with single-modality approaches. Third, the comprehensive approach combining medical imaging, genetic, and demographic data represents a promising frontier for fusion modeling, integrating AI and statistical learning in disease risk assessment. Our model achieved an area under the receiver operating characteristic curve (AUC) of 0.944, with an accuracy of 0.875, a sensitivity of 0.882, a specificity of 0.875, and a Youden index of 0.754. Additionally, the analysis revealed significant positive correlations between the multi-image risk score (MRS) and T2D, and between the PRS and T2D, identifying high-risk subgroups within the cohort.
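For reference, the Youden index is the sensitivity plus the specificity minus one at the chosen operating point; with the rounded values reported above:

```latex
% Youden index J at the reported operating point
J = \text{sensitivity} + \text{specificity} - 1
  = 0.882 + 0.875 - 1
  = 0.757
```

The small difference from the reported J = 0.754 is consistent with the sensitivity and specificity having been rounded to three decimals before publication.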
This study pioneers the integration of multimodal imaging pixel data and genome-wide genetic variation data for precise T2D risk assessment, advancing the understanding of precision and smart health.