Nucleotide-resolution DNA foundation models of prokaryotic genomes

Michael Fletcher
Nature Genetics 57 (1), 2 (2025) | Pub Date: 2025-01-15 | DOI: 10.1038/s41588-024-02062-5
Journal metrics: IF 29; Zone 1 (Biology); JCR Q1 (Genetics & Heredity)
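Machine learning models have recently generated excitement for their potential applications in a wide range of fields, including genomics. However, owing to their complexity, training DNA language models on large genomic contexts is very expensive, which has resulted in limited receptive fields and/or n-mer sequence tokenization. Nguyen et al. take a step forward for the field with Evo, a foundation model that applies the efficient hybrid StripedHyena architecture and was trained at single-nucleotide resolution on 80,000 prokaryotic genomes and millions of phage and plasmid sequences. In benchmarks, Evo shows comparable or improved performance relative to state-of-the-art nucleotide and protein language models in predicting variant fitness, promoter activity and protein expression. Impressively, Evo could be used to generate novel, experimentally validated CRISPR-Cas and transposon systems, and to predict gene essentiality through premature stop codon insertion; it also shows some promise for generating synthetic whole genomes. Foundation DNA language models that are applicable to many tasks will have broad utility, and Evo highlights their promise. However, it must be noted that the training dataset is small compared with the genomes of eukaryotes, and that the still-limited 131-kb context and next-token prediction will require further adaptation to the increasing complexity of multicellular life, which suggests that much work remains to be done.
Original reference: Science 386, eado9336 (2024)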
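The contrast drawn above between n-mer tokenization and single-nucleotide resolution can be illustrated with a minimal Python sketch. This is not Evo's actual tokenizer: the function names, the non-overlapping 3-mer scheme and the example sequences are all hypothetical, chosen only to show how each scheme represents a single-base substitution.

```python
from typing import List


def tokenize_single_nucleotide(seq: str) -> List[str]:
    """One token per base (illustrative only, not Evo's tokenizer)."""
    return list(seq.upper())


def tokenize_kmer(seq: str, k: int = 3) -> List[str]:
    """Non-overlapping k-mers; trailing bases that do not fill a k-mer are dropped."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]


def differing_positions(a: List[str], b: List[str]) -> List[int]:
    """Indices at which two equal-length token lists differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hypothetical reference and variant sequences differing at the last base.
reference = "ATGGCC"
variant = "ATGGCT"

nt_ref = tokenize_single_nucleotide(reference)
nt_var = tokenize_single_nucleotide(variant)
kmer_ref = tokenize_kmer(reference)
kmer_var = tokenize_kmer(variant)

# At single-nucleotide resolution the substitution maps to exactly one base token.
print(differing_positions(nt_ref, nt_var))      # [5]
# With 3-mers the same substitution replaces a whole token ("GCC" -> "GCT"),
# so the model sees a different vocabulary entry rather than a changed base.
print(differing_positions(kmer_ref, kmer_var))  # [1]
```

The trade-off this sketch hints at: n-mer tokens shorten the sequence the model must process, at the cost of a vocabulary that grows as 4^k and a coarser view of individual variants, whereas one token per base keeps every substitution directly visible but makes long contexts more costly.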