Izzy Newsham, Marcin Sendera, Sri Ganesh Jammula, Shamith A Samarajiwa
{"title":"利用可解释的机器学习揭示癌症特异性 DNA 甲基化模式,实现癌症的早期检测和诊断。","authors":"Izzy Newsham, Marcin Sendera, Sri Ganesh Jammula, Shamith A Samarajiwa","doi":"10.1093/biomethods/bpae028","DOIUrl":null,"url":null,"abstract":"<p><p>Cancer, a collection of more than two hundred different diseases, remains a leading cause of morbidity and mortality worldwide. Usually detected at the advanced stages of disease, metastatic cancer accounts for 90% of cancer-associated deaths. Therefore, the early detection of cancer, combined with current therapies, would have a significant impact on survival and treatment of various cancer types. Epigenetic changes such as DNA methylation are some of the early events underlying carcinogenesis. Here, we report on an interpretable machine learning model that can classify 13 cancer types as well as non-cancer tissue samples using only DNA methylome data, with 98.2% accuracy. We utilize the features identified by this model to develop EMethylNET, a robust model consisting of an XGBoost model that provides information to a deep neural network that can generalize to independent data sets. We also demonstrate that the methylation-associated genomic loci detected by the classifier are associated with genes, pathways and networks involved in cancer, providing insights into the epigenomic regulation of carcinogenesis.</p>","PeriodicalId":36528,"journal":{"name":"Biology Methods and Protocols","volume":"9 1","pages":"bpae028"},"PeriodicalIF":2.5000,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11186673/pdf/","citationCount":"0","resultStr":"{\"title\":\"Early detection and diagnosis of cancer with interpretable machine learning to uncover cancer-specific DNA methylation patterns.\",\"authors\":\"Izzy Newsham, Marcin Sendera, Sri Ganesh Jammula, Shamith A Samarajiwa\",\"doi\":\"10.1093/biomethods/bpae028\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Cancer, a collection of more than two hundred different diseases, remains a leading cause of morbidity and mortality worldwide. Usually detected at the advanced stages of disease, metastatic cancer accounts for 90% of cancer-associated deaths. Therefore, the early detection of cancer, combined with current therapies, would have a significant impact on survival and treatment of various cancer types. Epigenetic changes such as DNA methylation are some of the early events underlying carcinogenesis. Here, we report on an interpretable machine learning model that can classify 13 cancer types as well as non-cancer tissue samples using only DNA methylome data, with 98.2% accuracy. We utilize the features identified by this model to develop EMethylNET, a robust model consisting of an XGBoost model that provides information to a deep neural network that can generalize to independent data sets. We also demonstrate that the methylation-associated genomic loci detected by the classifier are associated with genes, pathways and networks involved in cancer, providing insights into the epigenomic regulation of carcinogenesis.</p>\",\"PeriodicalId\":36528,\"journal\":{\"name\":\"Biology Methods and Protocols\",\"volume\":\"9 1\",\"pages\":\"bpae028\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2024-06-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11186673/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biology Methods and Protocols\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/biomethods/bpae028\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q3\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biology Methods and Protocols","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/biomethods/bpae028","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q3","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
Early detection and diagnosis of cancer with interpretable machine learning to uncover cancer-specific DNA methylation patterns.
Cancer, a collection of more than two hundred different diseases, remains a leading cause of morbidity and mortality worldwide. Usually detected at the advanced stages of disease, metastatic cancer accounts for 90% of cancer-associated deaths. Therefore, the early detection of cancer, combined with current therapies, would have a significant impact on survival and treatment of various cancer types. Epigenetic changes such as DNA methylation are some of the early events underlying carcinogenesis. Here, we report on an interpretable machine learning model that can classify 13 cancer types as well as non-cancer tissue samples using only DNA methylome data, with 98.2% accuracy. We utilize the features identified by this model to develop EMethylNET, a robust model consisting of an XGBoost model that provides information to a deep neural network that can generalize to independent data sets. We also demonstrate that the methylation-associated genomic loci detected by the classifier are associated with genes, pathways and networks involved in cancer, providing insights into the epigenomic regulation of carcinogenesis.