GenomeDecoder: inferring segmental duplications in highly repetitive genomic regions.

Zhenmiao Zhang, Ishaan Gupta, Pavel A Pevzner
Journal: Bioinformatics (Oxford, England)
Publication date: 2025-02-04
DOI: https://doi.org/10.1093/bioinformatics/btaf058
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11842051/pdf/

Abstract

Motivation: The emergence of 'telomere-to-telomere' genomics brought the challenge of identifying segmental duplications (SDs) in complete genomes. It also opened the possibility of identifying differences in SDs across individual human genomes and studying SD evolution. These new challenges require algorithms for reconstructing SDs in the most complex genomic regions, such as rapidly evolving immunoglobulin loci, whose architecture has evaded all previous attempts at analysis.

Results: We describe the GenomeDecoder algorithm for inferring SDs and apply it to analyze the genomic architecture of various loci in primate genomes. Our analysis revealed that multiple duplications and deletions led to rapid birth and death of immunoglobulin genes within the human population and to large changes in the genomic architecture of immunoglobulin loci across primate genomes. Comparison of immunoglobulin loci across primate genomes suggests that they are subject to diversifying selection.

Availability and implementation: GenomeDecoder is available at https://github.com/ZhangZhenmiao/GenomeDecoder. The software version and test data used in this paper are archived at https://doi.org/10.5281/zenodo.14753844.

Supplementary information: Supplementary data are available at Bioinformatics online.