利用遗传和临床变量描述疾病特征:数据分析方法

Madhuri Gollapalli, Harsh Anand, Satish Mahadevan Srinivasan
{"title":"利用遗传和临床变量描述疾病特征:数据分析方法","authors":"Madhuri Gollapalli, Harsh Anand, Satish Mahadevan Srinivasan","doi":"10.1002/qub2.46","DOIUrl":null,"url":null,"abstract":"Predictive analytics is crucial in precision medicine for personalized patient care. To aid in precision medicine, this study identifies a subset of genetic and clinical variables that can serve as predictors for classifying diseased tissues/disease types. To achieve this, experiments were performed on diseased tissues obtained from the L1000 dataset to assess differences in the functionality and predictive capabilities of genetic and clinical variables. In this study, the k‐means technique was used for clustering the diseased tissue types, and the multinomial logistic regression (MLR) technique was applied for classifying the diseased tissue types. Dimensionality reduction techniques including principal component analysis and Boruta are used extensively to reduce the dimensionality of genetic and clinical variables. The results showed that landmark genes performed slightly better in clustering diseased tissue types compared to any random set of 978 non‐landmark genes, and the difference is statistically significant. Furthermore, it was evident that both clinical and genetic variables were important in predicting the diseased tissue types. The top three clinical predictors for predicting diseased tissue types were identified as morphology, gender, and age of diagnosis. Additionally, this study explored the possibility of using the latent representations of the clusters of landmark and non‐landmark genes as predictors for an MLR classifier. The classification models built using MLR revealed that landmark genes can serve as a subset of genetic variables and/or as a proxy for clinical variables. This study concludes that combining predictive analytics with dimensionality reduction effectively identifies key predictors in precision medicine, enhancing diagnostic accuracy.","PeriodicalId":508846,"journal":{"name":"Quantitative Biology","volume":"55 3","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Characterizing diseases using genetic and clinical variables: A data analytics approach\",\"authors\":\"Madhuri Gollapalli, Harsh Anand, Satish Mahadevan Srinivasan\",\"doi\":\"10.1002/qub2.46\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Predictive analytics is crucial in precision medicine for personalized patient care. To aid in precision medicine, this study identifies a subset of genetic and clinical variables that can serve as predictors for classifying diseased tissues/disease types. To achieve this, experiments were performed on diseased tissues obtained from the L1000 dataset to assess differences in the functionality and predictive capabilities of genetic and clinical variables. In this study, the k‐means technique was used for clustering the diseased tissue types, and the multinomial logistic regression (MLR) technique was applied for classifying the diseased tissue types. Dimensionality reduction techniques including principal component analysis and Boruta are used extensively to reduce the dimensionality of genetic and clinical variables. The results showed that landmark genes performed slightly better in clustering diseased tissue types compared to any random set of 978 non‐landmark genes, and the difference is statistically significant. Furthermore, it was evident that both clinical and genetic variables were important in predicting the diseased tissue types. The top three clinical predictors for predicting diseased tissue types were identified as morphology, gender, and age of diagnosis. Additionally, this study explored the possibility of using the latent representations of the clusters of landmark and non‐landmark genes as predictors for an MLR classifier. The classification models built using MLR revealed that landmark genes can serve as a subset of genetic variables and/or as a proxy for clinical variables. This study concludes that combining predictive analytics with dimensionality reduction effectively identifies key predictors in precision medicine, enhancing diagnostic accuracy.\",\"PeriodicalId\":508846,\"journal\":{\"name\":\"Quantitative Biology\",\"volume\":\"55 3\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Quantitative Biology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1002/qub2.46\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Quantitative Biology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/qub2.46","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

预测分析在精准医疗中对个性化患者护理至关重要。为了帮助精准医疗,本研究确定了可作为疾病组织/疾病类型分类预测因子的遗传和临床变量子集。为此,我们对从 L1000 数据集中获得的病变组织进行了实验,以评估遗传和临床变量在功能和预测能力上的差异。在这项研究中,使用 k-means 技术对患病组织类型进行聚类,并使用多项式逻辑回归(MLR)技术对患病组织类型进行分类。降维技术包括主成分分析和 Boruta,广泛用于降低遗传和临床变量的维度。结果表明,与任意一组 978 个非地标基因相比,地标基因在疾病组织类型的聚类中表现略好,且差异具有统计学意义。此外,临床变量和遗传变异显然对预测患病组织类型都很重要。预测患病组织类型的前三位临床预测因子分别是形态学、性别和诊断年龄。此外,本研究还探索了将标志性基因和非标志性基因群的潜在表征作为 MLR 分类器预测因子的可能性。使用 MLR 建立的分类模型显示,地标基因可以作为遗传变量的子集和/或临床变量的替代物。本研究的结论是,将预测分析与降维相结合,可有效识别精准医疗中的关键预测因子,提高诊断准确性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Characterizing diseases using genetic and clinical variables: A data analytics approach
Predictive analytics is crucial in precision medicine for personalized patient care. To aid in precision medicine, this study identifies a subset of genetic and clinical variables that can serve as predictors for classifying diseased tissues/disease types. To achieve this, experiments were performed on diseased tissues obtained from the L1000 dataset to assess differences in the functionality and predictive capabilities of genetic and clinical variables. In this study, the k‐means technique was used for clustering the diseased tissue types, and the multinomial logistic regression (MLR) technique was applied for classifying the diseased tissue types. Dimensionality reduction techniques including principal component analysis and Boruta are used extensively to reduce the dimensionality of genetic and clinical variables. The results showed that landmark genes performed slightly better in clustering diseased tissue types compared to any random set of 978 non‐landmark genes, and the difference is statistically significant. Furthermore, it was evident that both clinical and genetic variables were important in predicting the diseased tissue types. The top three clinical predictors for predicting diseased tissue types were identified as morphology, gender, and age of diagnosis. Additionally, this study explored the possibility of using the latent representations of the clusters of landmark and non‐landmark genes as predictors for an MLR classifier. The classification models built using MLR revealed that landmark genes can serve as a subset of genetic variables and/or as a proxy for clinical variables. This study concludes that combining predictive analytics with dimensionality reduction effectively identifies key predictors in precision medicine, enhancing diagnostic accuracy.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Deterministic modelling of asymptomatic spread and disease stage progression in vaccine preventable infectious diseases Perspectives on benchmarking foundation models for network biology In silico designing and optimization of anti‐epidermal growth factor receptor scaffolds by complementary‐determining regions‐grafting technique Mathematical modeling of evolution of cell networks in epithelial tissues A  substructure‐aware graph neural network incorporating relation features for drug–drug interaction prediction
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1