Jerry Z Yao, Igor F Tsigelny, Santosh Kesari, Valentina L Kouznetsova
Integrative Biology, vol. 15. Journal Article. Published 2023-04-11. DOI: 10.1093/intbio/zyad005 (https://doi.org/10.1093/intbio/zyad005)
Diagnostics of ovarian cancer via metabolite analysis and machine learning.
Ovarian cancer (OC) is the second most common cancer of the female reproductive system. Because early-stage OC is largely asymptomatic and the prognosis worsens in later stages, methods of screening for OC are highly desirable. To justify their use on asymptomatic patients, screening and diagnostic procedures must also be convenient and non-invasive. Recent developments in machine-learning technologies have made this possible through techniques from the field of metabolomics. The objective of this research was to use existing metabolomics data on OC, together with various analytic methods, to develop a machine-learning model for the classification of potentially OC-related metabolite biomarkers. Pathway analysis and metabolite-set enrichment analysis were performed on the gathered metabolite sets. Quantitative molecular descriptors of the related metabolites were then used with various machine-learning classifiers for OC diagnostics. We found that the OC-associated metabolites used for the machine-learning models are involved in five metabolic pathways linked to OC: Nicotinate and Nicotinamide Metabolism; Glycolysis/Gluconeogenesis; Aminoacyl-tRNA Biosynthesis; Valine, Leucine and Isoleucine Biosynthesis; and Alanine, Aspartate and Glutamate Metabolism. Several classification models for the identification of OC using related metabolites were created, and their accuracies were confirmed with 10-fold cross-validation. The most accurate model achieved 85.29% accuracy. Elucidating the biological pathways specific to OC from metabolic data, and observing changes in these pathways in patients, has the potential to contribute to the development of screening techniques for OC. Our results demonstrate that machine-learning models for OC diagnostics can be developed from metabolomics data.
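The 10-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the classifiers, descriptors, or data used, so everything concrete here is an assumption made for illustration: the nearest-centroid classifier is a stand-in, the descriptor vectors are synthetic, and the helper names (`stratified_folds`, `fit_centroids`, `cross_val_accuracy`, `make_synthetic`) are hypothetical.

```python
import random
from statistics import mean


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    offset = 0
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        # Deal indices round-robin, continuing where the previous class stopped.
        for j, i in enumerate(idx):
            folds[(offset + j) % k].append(i)
        offset += len(idx)
    return folds


def fit_centroids(X, y):
    """Stand-in classifier (assumption): per-class mean of the descriptor vectors."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for d, value in enumerate(row):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds; each sample is tested exactly once."""
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        centroids = fit_centroids(train_X, train_y)
        correct = sum(predict(centroids, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return mean(scores)


def make_synthetic(n_per_class=30, n_features=4, seed=1):
    """Synthetic descriptor vectors (not data from the paper)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in (("OC-related", 1.5), ("other", -1.5)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_synthetic()
accuracy = cross_val_accuracy(X, y, k=10)
print(f"mean 10-fold accuracy: {accuracy:.2%}")
```

The point of the procedure is that every sample is scored by a model that never saw it during fitting, so the reported figure estimates performance on unseen metabolites rather than fit to the training set. The accuracy printed here reflects only the synthetic data and has no relation to the 85.29% reported in the abstract.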
About the journal:
Integrative Biology publishes original biological research based on innovative experimental and theoretical methodologies that answer biological questions. The journal is multi- and inter-disciplinary, calling upon expertise and technologies from the physical sciences, engineering, computation, imaging, and mathematics to address critical questions in biological systems.
Research using experimental or computational quantitative technologies to characterise biological systems at the molecular, cellular, tissue and population levels is welcomed. Of particular interest are submissions contributing to quantitative understanding of how component properties at one level in the dimensional scale (nano to micro) determine system behaviour at a higher level of complexity.
Studies of synthetic systems, whether used to elucidate fundamental principles of biological function or as the basis for novel applications, are also of interest.