Predicting the Pathway Involvement of All Pathway and Associated Compound Entries Defined in the Kyoto Encyclopedia of Genes and Genomes

IF 3.4 | Region 3 (Biology) | JCR Q2, Biochemistry & Molecular Biology | Metabolites | Pub Date: 2024-10-27 | DOI: 10.3390/metabo14110582
Erik D Huckvale, Hunter N B Moseley
Citations: 0

Abstract


Background/Objectives: Predicting the biochemical pathway involvement of a compound could facilitate the interpretation of biological and biomedical research. Prior prediction approaches have largely focused on metabolism, training machine learning models to predict involvement in metabolic pathways only. However, cells and organisms contain many other types of pathways that are of interest to biologists.

Methods: Whereas several publications have used the metabolites and metabolic pathways available in the Kyoto Encyclopedia of Genes and Genomes (KEGG), we downloaded all KEGG compound entries that have pathway annotations. From these data, we constructed a dataset in which each entry contained features representing a compound combined with features representing a pathway, followed by a binary label indicating whether the given compound is associated with the given pathway. We trained multi-layer perceptron binary classifiers on variations of this dataset.

Results: The models trained on 6485 KEGG compounds and 502 pathways achieved an overall mean Matthews correlation coefficient (MCC) of 0.847, a median MCC of 0.848, and a standard deviation of 0.0098.

Conclusions: This performance on all 502 KEGG pathways represents a roughly 6% improvement over the performance of models trained on only the 184 KEGG metabolic pathways, which had a mean MCC of 0.800 and a standard deviation of 0.021. These results demonstrate that biochemical pathway involvement can be predicted effectively in general, not only for pathways related to metabolism. Moreover, the improvement in performance indicates additional transfer learning from the inclusion of non-metabolic pathways.
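
The dataset layout and the evaluation metric described in the abstract can be illustrated with a small sketch. It is not the authors' code: the toy feature vectors, the helper names `build_pairs` and `mcc`, the use of plain list concatenation to "combine" features, and the pairing of every compound with every pathway are all assumptions made for illustration. Only the MCC formula and the reported mean MCC values (0.847 and 0.800) are taken as given.

```python
from itertools import product
from math import sqrt


def build_pairs(compound_features, pathway_features, associations):
    """Pair every compound with every pathway (assumed cross join).

    Each row holds the compound vector concatenated with the pathway
    vector, followed by a binary label: 1 if the pair is annotated.
    """
    rows = []
    for (cid, cvec), (pid, pvec) in product(
        compound_features.items(), pathway_features.items()
    ):
        label = 1 if (cid, pid) in associations else 0
        rows.append((cid, pid, cvec + pvec, label))
    return rows


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy inputs; real features would be derived from KEGG entries.
compounds = {"C1": [1.0, 0.0], "C2": [0.0, 1.0]}
pathways = {"P1": [1.0], "P2": [0.0]}
known = {("C1", "P1"), ("C2", "P2")}

rows = build_pairs(compounds, pathways, known)
labels = [row[3] for row in rows]

# Hypothetical classifier output with one false positive.
predictions = [1, 0, 1, 1]
score = mcc(labels, predictions)

# Relative improvement implied by the reported mean MCC values.
relative_improvement = (0.847 - 0.800) / 0.800
```

With the reported means, `relative_improvement` evaluates to about 0.059, which matches the "roughly 6%" stated in the conclusions. A binary classifier such as a multi-layer perceptron would be trained on the concatenated vectors in `rows`, with `labels` as the target.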

Source journal: Metabolites (Biochemistry, Genetics and Molecular Biology: Molecular Biology)
CiteScore: 5.70
Self-citation rate: 7.30%
Articles published: 1070
Review time: 17.17 days
Journal description: Metabolites (ISSN 2218-1989) is an international, peer-reviewed open access journal of metabolism and metabolomics. Metabolites publishes original research articles and review articles in all molecular aspects of metabolism relevant to the fields of metabolomics, metabolic biochemistry, computational and systems biology, biotechnology and medicine, with a particular focus on the biological roles of metabolites and small molecule biomarkers. Metabolites encourages scientists to publish their experimental and theoretical results in as much detail as possible. Therefore, there is no restriction on article length. Sufficient experimental details must be provided to enable the results to be accurately reproduced. Electronic material representing additional figures, materials and methods explanation, or supporting results and evidence can be submitted with the main manuscript as supplementary material.