Genome assembly in the telomere-to-telomere era
Authors: Heng Li, Richard Durbin
Journal: Nature Reviews Genetics (JCR Q1, Genetics & Heredity; impact factor 39.1)
Publication type: Journal Article
Published: 2024-04-22
DOI: 10.1038/s41576-024-00718-w
URL: https://www.nature.com/articles/s41576-024-00718-w
Citation count: 0
Abstract
Genome sequences largely determine the biology and encode the history of an organism, and de novo assembly — the process of reconstructing the genome sequence of an organism from sequencing reads — has been a central problem in bioinformatics for four decades. Until recently, genomes were typically assembled into fragments of a few megabases at best, but now technological advances in long-read sequencing enable the near-complete assembly of each chromosome — also known as telomere-to-telomere assembly — for many organisms. Here, we review recent progress on assembly algorithms and protocols, with a focus on how to derive near-telomere-to-telomere assemblies. We also discuss the additional developments that will be required to resolve remaining assembly gaps and to assemble non-diploid genomes. In this Review, Li and Durbin discuss how to generate telomere-to-telomere assemblies for large haploid or diploid genomes using currently available data types and algorithms, and outline remaining challenges in resolving highly repetitive sequences and polyploid genomes.
About the journal:
At Nature Reviews Genetics, our goal is to be the leading source of reviews and commentaries for the scientific communities we serve. We are dedicated to publishing authoritative articles that are easily accessible to our readers. We believe in enhancing our articles with clear and understandable figures, tables, and other display items. Our aim is to provide an unparalleled service to authors, referees, and readers, and we are committed to maximizing the usefulness and impact of each article we publish.
Within our journal, we publish a range of content including Research Highlights, Comments, Reviews, and Perspectives that are relevant to geneticists and genomicists. With our broad scope, we ensure that the articles we publish reach the widest possible audience.
As part of the Nature Reviews portfolio of journals, we strive to uphold the high standards and reputation associated with this esteemed collection of publications.