Palindromes in SARS and Other Coronaviruses.

Informs Journal on Computing. Pub Date: 2004-01-01. DOI: 10.1287/ijoc.1040.0087. IF 2.3; Zone 4 (Computer Science); JCR Q3 (Computer Science, Interdisciplinary Applications).
David S H Chew, Kwok Pui Choi, Hans Heidner, Ming-Ying Leung
Informs Journal on Computing, vol. 16, no. 4, pp. 331-340. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4066412/pdf/nihms583805.pdf

Abstract

With the identification of a novel coronavirus associated with the severe acute respiratory syndrome (SARS), computational analysis of its RNA genome sequence is expected to give useful clues to help elucidate the origin, evolution, and pathogenicity of the virus. In this paper, we study the collective counts of palindromes in the SARS genome along with all the completely sequenced coronaviruses. Based on a Markov-chain model for the genome sequence, the mean and standard deviation for the number of palindromes at or above a given length are derived. These theoretical results are complemented by extensive simulations to provide empirical estimates. Using a z score obtained from these mathematical and empirical means and standard deviations, we have observed that palindromes of length four are significantly underrepresented in all the coronaviruses in our data set. In contrast, length-six palindromes are significantly underrepresented only in the SARS coronavirus. Two other features are unique to the SARS sequence. First, there is a length-22 palindrome TCTTTAACAAGCTTGTTAAAGA spanning positions 25962-25983. Second, there are two repeating length-12 palindromes TTATAATTATAA spanning positions 22712-22723 and 22796-22807. Some further investigations into possible biological implications of these palindrome features are proposed.
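The counting and z-score steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that "palindrome" means a sequence equal to its own reverse complement (which holds for both sequences quoted above), that a palindrome "at or above" a given length is counted once per center position, and that the Markov chain is first-order and fitted to the observed sequence; the abstract states none of these details. The function names are hypothetical, and only the simulation-based (empirical) mean and standard deviation are computed, not the theoretical ones.

```python
import random
from statistics import mean, stdev

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def is_palindrome(seq):
    """True if seq equals its own reverse complement (requires even length)."""
    if len(seq) % 2:
        return False
    return all(COMPLEMENT[a] == b for a, b in zip(seq, reversed(seq)))

def count_palindromes(seq, min_len):
    """Count centers whose two arms are complementary over min_len // 2 bases.

    Assumed convention: a longer palindrome also counts here, once,
    because its innermost min_len bases already form a palindrome.
    """
    half = min_len // 2
    count = 0
    for center in range(half, len(seq) - half + 1):
        if is_palindrome(seq[center - half:center + half]):
            count += 1
    return count

def simulate_markov(seq, rng):
    """Draw a same-length sequence from a first-order Markov chain fitted to seq."""
    transitions = {}
    for a, b in zip(seq, seq[1:]):
        transitions.setdefault(a, []).append(b)
    out = [seq[0]]
    while len(out) < len(seq):
        successors = transitions.get(out[-1])
        # Fall back to the overall base composition if a base has no successor.
        out.append(rng.choice(successors if successors else seq))
    return "".join(out)

def z_score(seq, min_len, n_sim=200, seed=0):
    """Empirical z score: (observed count - simulated mean) / simulated sd."""
    rng = random.Random(seed)
    observed = count_palindromes(seq, min_len)
    sims = [count_palindromes(simulate_markov(seq, rng), min_len)
            for _ in range(n_sim)]
    sd = stdev(sims)
    return (observed - mean(sims)) / sd if sd > 0 else float("nan")

# The two SARS palindromes quoted in the abstract pass the reverse-complement check.
print(is_palindrome("TCTTTAACAAGCTTGTTAAAGA"))  # True
print(is_palindrome("TTATAATTATAA"))            # True
```

Under these assumptions, a strongly negative value of z_score(genome, 4) would correspond to the underrepresentation of length-four palindromes reported above. The genome string must contain only A, C, G, and T.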

Source journal
Informs Journal on Computing (Engineering & Technology, Computer Science: Interdisciplinary Applications)
CiteScore: 4.20
Self-citation rate: 14.30%
Articles published: 162
Review time: 7.5 months
About the journal: The INFORMS Journal on Computing (JOC) is a quarterly that publishes papers at the intersection of operations research (OR) and computer science (CS). Most papers contain original research, but the journal also welcomes special papers in a variety of forms, including Feature Articles on timely topics, Expository Reviews that survey and evaluate a subject area comprehensively, and State-of-the-Art Reviews that collect and integrate recent streams of research.