ARGai 1.0: A GAN augmented in silico approach for identifying resistant genes and strains in E. coli using vision transformer

IF 2.6 | Zone 4 (Biology) | JCR Q2 (Biology) | Computational Biology and Chemistry | Pub Date: 2025-01-07 | DOI: 10.1016/j.compbiolchem.2025.108342
Debasish Swapnesh Kumar Nayak, Ruchika Das, Santanu Kumar Sahoo, Tripti Swarnkar
Computational Biology and Chemistry, volume 115, Article 108342. Journal article. https://www.sciencedirect.com/science/article/pii/S1476927125000027
Citations: 0

Abstract

The emergence of infectious disease and antibiotic resistance in bacteria such as Escherichia coli (E. coli) underscores the need for novel computational techniques that identify the essential genes contributing to resistance. Identifying resistant strains and multi-drug resistance patterns in E. coli from whole genome sequencing (WGS) and next-generation sequencing (NGS) data remains a major challenge. To address this, we propose ARGai 1.0, a deep learning architecture enhanced with generative adversarial networks (GANs). We mitigate data scarcity by augmenting limited experimental datasets with GAN-generated synthetic data. Our in silico method (augmentation with feature selection) improves the identification of resistance genes in E. coli by using feature extraction techniques to select informative features from both real and GAN-generated data. Through comprehensive validation, we demonstrate the effectiveness of ARGai 1.0 in precisely identifying informative and resistant genes. ARGai 1.0 identifies resistant strains with a classification accuracy of 98.96% on data augmented with a Deep Convolutional Generative Adversarial Network (DCGAN), and it achieves sensitivity and specificity above 98%. We also benchmark ARGai 1.0 against several state-of-the-art AI models for resistant strain classification. ARGai 1.0 offers a promising avenue for computational genomics in the fight against antibiotic resistance. With implications for research and clinical practice, this work shows the potential of deep networks with GAN augmentation as a practical and effective method for gene identification in E. coli.
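The abstract reports accuracy, sensitivity, and specificity for resistant-strain classification. As a minimal sketch of how these three figures relate to a confusion matrix, the Python below computes them for a binary task. It is not the authors' evaluation code: the function name `binary_metrics`, the label convention (1 = resistant, 0 = susceptible), and the toy labels are assumptions made only for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for a binary classifier.

    Assumed label convention (not from the paper): 1 = resistant, 0 = susceptible.
    Both classes must be present in y_true, otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # resistant, predicted resistant
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # susceptible, predicted susceptible
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # susceptible, predicted resistant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # resistant, predicted susceptible

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("y_true must contain both classes")

    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when resistant and susceptible strains are imbalanced, since accuracy alone can look high even if one class is mostly misclassified.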
Source journal
Computational Biology and Chemistry
Category: Biology / Computer Science: Interdisciplinary Applications
CiteScore: 6.10
Self-citation rate: 3.20%
Articles published: 142
Review time: 24 days
Journal description: Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High-quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high-quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered. Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, molecular dynamics simulations along with detailed free energy calculations, for example, should be used as complementary techniques to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered. Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.