Organizational Heterogeneity of the Human Genome: Significant Variation of Recombination Rate of 100 kbp Sequences within GC Ranges

S. Frenkel, V. Kirzhner, Z. Frenkel, A. Korol
Published in: 2016 Second International Symposium on Stochastic Models in Reliability Engineering, Life Science and Operations Management (SMRLO)
Publication date: 2016-02-15
DOI: 10.1109/SMRLO.2016.72 (https://doi.org/10.1109/SMRLO.2016.72)

Abstract

The association between the nucleotide composition of genome sequences and their functional characteristics is widely known; among the most studied characteristics correlated with GC content are gene density, gene expression, and recombination rate. Previously, we found that genomic regions similar in nucleotide composition may exhibit considerable differences in sequence organization, and we hypothesized that organizationally different regions may also exhibit functional and evolutionary heterogeneity. Here we examine this hypothesis by classifying 100 kbp segments of the human genome into 14 compositionally homogeneous groups according to their GC content and differentiating the segments within each group by organization patterns (OP) using oligonucleotide (k-mer) counting, referred to as Compositional Spectra (CS) Analysis. We identified 141 groups of segments that differ in their CS organization and found that the resulting compositionally similar OP groups (OPGs) differ significantly in their recombination rate. This conclusion was robust with respect to the selected window size (confirmed by independent analyses of 50 kb and 200 kb segments). We further tested the contribution of specific k-mers to the clustering of 100 kbp segments into OPGs with contrasting levels of recombination rate. Eight k-mers, which demonstrated the highest importance for such clustering, allowed correct classification of at least 76% of segments in all 14 OPG pairs. Moreover, these k-mers proved similar to five previously described patterns related to recombination hotspots, including the best-known 13 bp recombination motif CCNCCNTNNCCNC.
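The building blocks named in the abstract (GC content of a segment, assignment to a GC range, k-mer counting, and matching of the degenerate motif CCNCCNTNNCCNC) can be illustrated with a minimal Python sketch. This is not the authors' Compositional Spectra procedure: it counts exact k-mers only, and the function names, the GC cut points passed to `gc_bin`, and the choice of `k` are hypothetical and chosen for illustration.

```python
import re
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def gc_bin(gc: float, edges: list[float]) -> int:
    """Index of the GC range containing gc.

    `edges` are hypothetical ascending cut points; the abstract does not
    state the boundaries of its 14 GC groups.
    """
    return sum(gc >= e for e in edges)


def kmer_spectrum(seq: str, k: int) -> dict[str, float]:
    """Relative frequencies of exact k-mers; windows with non-ACGT symbols are skipped."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def motif_positions(seq: str, motif: str = "CCNCCNTNNCCNC") -> list[int]:
    """0-based start positions of the degenerate motif; N matches any of A, C, G, T.

    A zero-width lookahead is used so that overlapping occurrences are reported.
    """
    pattern = re.compile("(?=(" + motif.upper().replace("N", "[ACGT]") + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Toy segment containing one instance of the motif (CCTCCCTGACCAC) at offset 2.
segment = "AACCTCCCTGACCACGGT"
print(gc_content(segment))
print(gc_bin(gc_content(segment), [0.4, 0.5, 0.6]))
print(kmer_spectrum(segment, 2))
print(motif_positions(segment))
```

In a real analysis the segment would be a 100 kbp window of a genome assembly, and the per-segment spectra would feed a clustering step; neither of those steps is sketched here.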