Single-fly genome assemblies fill major phylogenomic gaps across the Drosophilidae Tree of Life.

PLoS Biology (IF 9.8; Zone 1, Biology; Q1, Agricultural and Biological Sciences). Pub Date: 2024-07-18. eCollection Date: 2024-07-01. DOI: 10.1371/journal.pbio.3002697
Bernard Y Kim, Hannah R Gellert, Samuel H Church, Anton Suvorov, Sean S Anderson, Olga Barmina, Sofia G Beskid, Aaron A Comeault, K Nicole Crown, Sarah E Diamond, Steve Dorus, Takako Fujichika, James A Hemker, Jan Hrcek, Maaria Kankare, Toru Katoh, Karl N Magnacca, Ryan A Martin, Teruyuki Matsunaga, Matthew J Medeiros, Danny E Miller, Scott Pitnick, Michele Schiffer, Sara Simoni, Tessa E Steenwinkel, Zeeshan A Syed, Aya Takahashi, Kevin H-C Wei, Tsuya Yokoyama, Michael B Eisen, Artyom Kopp, Daniel Matute, Darren J Obbard, Patrick M O'Grady, Donald K Price, Masanori J Toda, Thomas Werner, Dmitri A Petrov

Abstract

Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser-studied drosophilid taxa in whole-genome data. Using Illumina NovaSeq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group. Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
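
For readers unfamiliar with the quality thresholds quoted in the abstract, the following is a minimal Python sketch of how contig N50 and a Phred-scaled quality value (QV) are conventionally defined. It is illustrative only and is not the authors' pipeline; the function names and the toy contig lengths are assumptions made up for this example.

```python
import math


def contig_n50(lengths):
    """Return the contig N50 of an assembly.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def phred_qv(error_rate):
    """Convert a per-base error rate into a Phred-scaled quality value."""
    return -10.0 * math.log10(error_rate)


def error_rate_from_qv(qv):
    """Convert a Phred-scaled quality value back into a per-base error rate."""
    return 10.0 ** (-qv / 10.0)


# Hypothetical contig lengths in base pairs (not data from the paper).
toy_contigs = [2_000_000, 1_500_000, 800_000, 400_000, 300_000]

print("contig N50 (bp):", contig_n50(toy_contigs))
print("QV at one error per 10,000 bases:", phred_qv(1e-4))
print("per-base error rate at QV40:", error_rate_from_qv(40))
```

Under these definitions, the abstract's ">1 Mb contig N50" means that at least half of the assembled bases lie in contigs longer than 1 Mb, and ">QV40" corresponds to fewer than about one error per 10,000 bases.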
