Classification of NSCLC subtypes using lung microbiome from resected tissue based on machine learning methods
Authors: Pragya Kashyap, Kalbhavi Vadhi Raj, Jyoti Sharma, Naveen Dutt, Pankaj Yadav
Journal: NPJ Systems Biology and Applications, volume 11, issue 1, article 11 (impact factor 3.5; JCR Q1, Mathematical & Computational Biology)
Published: 2025-01-17
DOI: https://doi.org/10.1038/s41540-025-00491-4
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11742043/pdf/
Classification of adenocarcinoma (AC) and squamous cell carcinoma (SCC) poses significant challenges for cytopathologists, often necessitating clinical tests and biopsies that delay treatment initiation. To address this, we developed a machine learning-based approach that uses the resected lung-tissue microbiome of AC and SCC patients for subtype classification. Differentially enriched taxa were identified using LEfSe, revealing ten potential microbial markers. Linear discriminant analysis (LDA) was subsequently applied to enhance inter-class separability. Next, six supervised classification algorithms were benchmarked: logistic regression, naïve Bayes, random forest, extreme gradient boosting (XGBoost), k-nearest neighbors, and a deep neural network. Notably, XGBoost outperformed the other five algorithms on LDA-transformed features, with an accuracy of 76.25%, an area under the receiver operating characteristic curve (AUROC) of 0.81, 69% specificity, and 76% sensitivity. Validation on an independent dataset yielded an AUROC of 0.71, with minimal false positives and negatives. This study is the first to classify AC and SCC subtypes using the lung-tissue microbiome.
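The abstract reports accuracy, sensitivity, specificity, and AUROC. As a reading aid, the following minimal sketch shows how these four metrics are conventionally computed from true labels and classifier scores. It is not the authors' code: the labels, scores, 0.5 decision threshold, and the choice of which subtype is encoded as the positive class (1) are all hypothetical assumptions made for illustration.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive sample outscores
    a random negative sample; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> dict:
    """Threshold the scores, then compute the four reported metrics."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auroc": auroc(y_true, scores),
    }


# Hypothetical toy data (not from the study): 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
metrics = summarize(y_true, scores)
print(metrics)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUROC is computed from the ranking of the scores alone. This is why a classifier can be compared across datasets by AUROC even when its operating threshold differs.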
About the journal:
npj Systems Biology and Applications is an online Open Access journal dedicated to publishing the premier research that takes a systems-oriented approach. The journal aims to provide a forum for the presentation of articles that help define this nascent field, as well as those that apply the advances to wider fields. We encourage studies that integrate, or aid the integration of, data, analyses and insight from molecules to organisms and broader systems. Important areas of interest include not only fundamental biological systems and drug discovery, but also applications to health, medical practice and implementation, big data, biotechnology, food science, human behaviour, broader biological systems and industrial applications of systems biology.
We encourage all approaches, including network biology, application of control theory to biological systems, computational modelling and analysis, comprehensive and/or high-content measurements, theoretical, analytical and computational studies of system-level properties of biological systems and computational/software/data platforms enabling such studies.