Protein Sequence Comparison Method Based on 3-ary Huffman Coding

IF 2.9 · Zone 2 (Chemistry) · JCR Q2, CHEMISTRY, MULTIDISCIPLINARY · Match-Communications in Mathematical and in Computer Chemistry · Pub Date: 2023-04-01 · DOI: 10.46793/match.90-2.357q
Zhaohui Qi, Yingqiang Ning, Yinmei Huang
Citations: 0

Abstract

We propose a digital mapping method for protein sequences based on the 3-ary Huffman coding algorithm. First, a 3-ary Huffman tree is built from the frequencies of the 20 amino acids in the given protein sequences. The tree assigns each amino acid a code over the digits 0, 1, and 2 (a 0-2 code), so a long protein sequence can be converted one-to-one into a 0-2 digital sequence. From the frequency and distribution information of the 20 amino acid codes in these digital sequences, we design 40-dimensional vectors to characterize the protein sequences. The method is then applied to three separate tasks: similarity comparison of nine ND6 proteins, evolutionary trend analysis of the 2009 pandemic human influenza A (H1N1) viruses from January 2020 to June 2022, and evolution analysis of 95 coronavirus genes. The results illustrate the utility of the proposed method.
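The first two steps of the abstract (building a 3-ary Huffman tree from amino acid frequencies, then mapping a protein to a 0-2 digital sequence) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `ternary_huffman_codes`, `encode`, and `decode`, the tie-breaking order, the zero-weight dummy leaf used to pad 20 symbols to an odd count, and the toy sequences are all assumptions. The 40-dimensional vector is not shown because the abstract does not define it.

```python
import heapq
from collections import Counter
from itertools import count

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def ternary_huffman_codes(sequences):
    """Map each amino acid to a codeword over the digits 0, 1, 2."""
    freq = Counter(ch for seq in sequences for ch in seq if ch in AMINO_ACIDS)
    tie = count()  # unique tie-breaker so the heap never compares tree nodes
    heap = [(freq[a], next(tie), a) for a in AMINO_ACIDS]
    # A full 3-ary tree needs an odd number of leaves. 20 is even, so one
    # zero-weight dummy leaf (None) is added; this padding is an assumption.
    if (len(heap) - 1) % 2 != 0:
        heap.append((0, next(tie), None))
    heapq.heapify(heap)
    # Repeatedly merge the three lightest nodes into one internal node.
    while len(heap) > 1:
        children = [heapq.heappop(heap) for _ in range(3)]
        weight = sum(c[0] for c in children)
        heapq.heappush(heap, (weight, next(tie), [c[2] for c in children]))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, list):      # internal node: branches 0, 1, 2
            for digit, child in enumerate(node):
                walk(child, prefix + str(digit))
        elif node is not None:          # real leaf (the dummy is skipped)
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def encode(sequence, codes):
    """Concatenate codewords; raises KeyError on non-standard residues."""
    return "".join(codes[ch] for ch in sequence)


def decode(digits, codes):
    """Invert encode(); works because Huffman codes are prefix-free."""
    inverse = {code: aa for aa, code in codes.items()}
    out, buf = [], ""
    for d in digits:
        buf += d
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)


# Toy usage with made-up sequences (not data from the paper).
toy_sequences = ["MKTAYIAKQRQISFVKSHFSRQ", "MKLVYIAGQRWWCC"]
toy_codes = ternary_huffman_codes(toy_sequences)
toy_digits = encode(toy_sequences[0], toy_codes)
```

Because the codewords are prefix-free, `decode(toy_digits, toy_codes)` recovers the original sequence, which is the sense in which the mapping is one-to-one. More frequent amino acids receive codewords no longer than those of rarer ones.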
Source journal
CiteScore: 4.40
Self-citation rate: 26.90%
Articles published: 71
Review time: 2 months
About the journal: MATCH Communications in Mathematical and in Computer Chemistry publishes papers of original research as well as reviews on chemically important mathematical results and non-routine applications of mathematical techniques to chemical problems. A paper acceptable for publication must contain non-trivial mathematics or communicate non-routine computer-based procedures AND have a clear connection to chemistry. Papers are published without any processing or publication charge.