Deep phenotyping obesity using EHR data: Promise, Challenges, and Future Directions.

Xiaoyang Ruan, Shuyu Lu, Liwei Wang, Andrew Wen, Murali Sameer, Hongfang Liu
Journal: medRxiv : the preprint server for health sciences
Publication date: 2024-12-16
DOI: 10.1101/2024.12.06.24318608 (https://doi.org/10.1101/2024.12.06.24318608)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11643233/pdf/

Abstract

Obesity affects approximately 34% of adults and 15-20% of children and adolescents in the U.S. and poses significant economic and psychosocial burdens. Because obesity is multifaceted, patient responses to any single anti-obesity medication (AOM) currently vary significantly, highlighting the need for approaches to obesity deep phenotyping and associated precision medicine. While recent advances in classical phenotyping-guided pharmacotherapies have shown clinical value, they are less embraced by healthcare providers within the precision medicine framework, primarily due to their operational complexity and lack of granularity. From this perspective, several recent review articles have highlighted the importance of obesity deep phenotyping for personalized precision medicine. In view of the established role of the electronic health record (EHR) as an important data source for clinical phenotyping, we offer an in-depth analysis of the data elements commonly available from patients with obesity prior to pharmacotherapy. We also experimented with a multi-modal longitudinal deep autoencoder to explore the feasibility, data requirements, clustering patterns, and challenges associated with EHR-based obesity deep phenotyping. Our analysis indicates at least nine clusters, five of which have distinct, explainable clinical relevance. Further research in larger independent cohorts is warranted to validate reproducibility and to uncover more detailed substructures and the corresponding treatment responses.

Background: Obesity affects approximately 40% of adults and 15-20% of children and adolescents in the U.S. and poses significant economic and psychosocial burdens. Currently, patient responses to any single anti-obesity medication (AOM) vary significantly, making obesity deep phenotyping and associated precision medicine important targets of investigation.

Objective: To evaluate the potential of EHR as a primary data source for obesity deep phenotyping, we conduct an in-depth analysis of the data elements and quality available from obesity patients prior to pharmacotherapy, and apply a multi-modal longitudinal deep autoencoder to investigate the feasibility, data requirements, clustering patterns, and challenges associated with EHR-based obesity deep phenotyping.

Methods: We analyzed 53,688 pre-AOM periods from 32,969 patients with obesity or overweight who underwent medium- to long-term AOM treatment. A total of 92 lab and vital measurements, along with 79 ICD-derived Clinical Classifications Software (CCS) codes recorded within one year prior to AOM treatment, were used to train a gated recurrent unit with decay-based longitudinal autoencoder (GRU-D-AE) to generate dense embeddings for each pre-AOM record. Principal component analysis (PCA) and Gaussian mixture modeling (GMM) were applied to identify clusters.
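
To illustrate the "decay" idea behind a GRU-D style model, the following is a minimal sketch of decay-based imputation for a single irregularly sampled measurement. It follows the general GRU-D input-decay formulation (a missing value is pulled from the last observation toward the feature mean as the time gap grows); it is not the authors' implementation. The function name `decay_impute` and the fixed parameters `w` and `b` are hypothetical; in an actual GRU-D model the decay parameters are learned during training.

```python
import math


def decay_impute(values, times, mean, w=0.5, b=0.0):
    """Fill missing entries (None) of one feature with a decayed estimate.

    values: observations in time order, None where the feature is missing
    times:  timestamps of each entry (any consistent unit, e.g. months)
    mean:   empirical mean of the feature, used as the long-run fallback
    w, b:   hypothetical fixed decay parameters (learned in a real GRU-D)
    """
    imputed = []
    last_value = None  # most recent observed value
    last_time = None   # timestamp of that observation
    for x, t in zip(values, times):
        if x is not None:
            # Observed: keep the value and reset the reference point.
            imputed.append(x)
            last_value, last_time = x, t
        elif last_value is None:
            # Nothing observed yet: fall back to the feature mean.
            imputed.append(mean)
        else:
            # Missing: decay from the last observation toward the mean.
            delta = t - last_time
            gamma = math.exp(-max(0.0, w * delta + b))
            imputed.append(gamma * last_value + (1.0 - gamma) * mean)
    return imputed


if __name__ == "__main__":
    # One measurement observed at month 0, then missing at months 1 and 10.
    print(decay_impute([100.0, None, None], [0, 1, 10], mean=90.0))
```

With a short gap the imputed value stays close to the last observation, and with a long gap it approaches the mean, which is the behavior that makes this family of models suited to sparsely and irregularly recorded lab and vital data.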

Results: Our analysis identified at least nine clusters, with five exhibiting distinct and explainable clinical relevance. Certain clusters show characteristics that overlap with phenotypes from traditional phenotyping strategies. Results from multiple training folds demonstrated stable clustering patterns in two-dimensional space and reproducible clinical significance. However, challenges persist regarding the stability of missing data imputation across folds, maintaining consistency in input features, and effectively visualizing complex diseases in low-dimensional spaces.
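
The abstract does not state how clustering stability across folds was quantified. As one possible approach, and purely as an assumption for illustration, the adjusted Rand index (ARI) can compare the cluster assignments that two folds give to the same records. ARI ignores how clusters are numbered, which matters because each fold's mixture model labels its clusters arbitrarily. The function name and the toy labels below are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same records.

    Returns 1.0 for identical partitions (up to relabeling) and values
    near 0.0 for agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same records")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of records placed together in both labelings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each labeling separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (max_index - expected)


if __name__ == "__main__":
    fold_1 = [0, 0, 1, 1, 2, 2]
    fold_2 = [2, 2, 0, 0, 1, 1]  # same grouping, different cluster numbering
    print(adjusted_rand_index(fold_1, fold_2))
```

An ARI close to 1 across pairs of folds would be consistent with the stable clustering patterns reported above, while low values would point to the imputation and feature-consistency issues the authors list as open challenges.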

Conclusion: In this proof-of-concept study, we demonstrated that longitudinal EHR data are a valuable resource for deep phenotyping the pre-AOM period at the per-patient-visit level. Our analysis revealed clusters with distinct clinical significance, which could have implications for AOM treatment options. Further research using larger, independent cohorts is necessary to validate the reproducibility and clinical relevance of these clusters and to uncover more detailed substructures and the corresponding AOM treatment responses.
