ARGai 1.0: A GAN augmented in silico approach for identifying resistant genes and strains in E. coli using vision transformer
Authors: Debasish Swapnesh Kumar Nayak, Ruchika Das, Santanu Kumar Sahoo, Tripti Swarnkar
Journal: Computational Biology and Chemistry, volume 115, Article 108342
Published: 2025-01-07
DOI: 10.1016/j.compbiolchem.2025.108342
URL: https://www.sciencedirect.com/science/article/pii/S1476927125000027
The emergence of infectious disease and antibiotic resistance in bacteria such as Escherichia coli (E. coli) calls for novel computational techniques that identify the essential genes contributing to resistance. Identifying resistant strains and multi-drug patterns in E. coli from whole genome sequencing (WGS) and next-generation sequencing (NGS) data remains a major challenge. To address this issue, we propose ARGai 1.0, a deep learning architecture enhanced with generative adversarial networks (GANs). We mitigate data scarcity by augmenting limited experimental datasets with synthetic data generated by GANs. Our in silico method (augmentation with feature selection) improves the identification of resistance genes in E. coli by using feature extraction techniques to identify valuable features from real and GAN-generated data. Through comprehensive validation, we demonstrate the effectiveness of ARGai 1.0 in precisely identifying informative and resistant genes. ARGai 1.0 also identifies resistant strains with a classification accuracy of 98.96% on data augmented with a Deep Convolutional Generative Adversarial Network (DCGAN), and achieves sensitivity and specificity above 98%. We further benchmark ARGai 1.0 against several state-of-the-art AI models for resistant strain classification. In the fight against antibiotic resistance, ARGai 1.0 offers a promising avenue for computational genomics. With implications for research and clinical practice, this work shows the potential of deep networks with GAN augmentation as a practical and successful method for gene identification in E. coli.
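The abstract reports accuracy, sensitivity, and specificity without defining them here. As a minimal sketch, assuming a two-class setting (resistant vs. susceptible) and the standard confusion-matrix definitions, these metrics could be computed as follows. The names `BinaryMetrics` and `binary_metrics` and the toy labels are hypothetical, chosen only for illustration; this is not the authors' evaluation code, and the numbers are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # true positive rate: TP / (TP + FN)
    specificity: float  # true negative rate: TN / (TN + FP)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for two-class labels.

    Hypothetical helper: `positive` marks the class treated as "resistant".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    # Sensitivity and specificity are undefined if a class is absent.
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return BinaryMetrics(
        accuracy=(tp + tn) / len(pairs),
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
    )


# Toy labels invented for illustration (1 = resistant, 0 = susceptible).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even if one class is rarely predicted correctly.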
About the journal:
Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High-quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high-quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered.
Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, complementary techniques (for example, molecular dynamics simulations along with detailed free energy calculations) should be used to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered.
Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief synopsis (one to three pages), which will be evaluated by the editors.