A Novel Fast Approach for Protein Classification and Evolutionary Analysis
Authors: Liang Ai, Jie Feng, Yu-Hua Yao
Journal: MATCH Communications in Mathematical and in Computer Chemistry
DOI: 10.46793/match.90-2.381a (https://doi.org/10.46793/match.90-2.381a)
Publication date: 2023-04-01
In this paper, we propose a new fast alignment-free method for protein sequence similarity and evolutionary analysis. First, the 20 natural amino acids are clustered into 6 groups based on their physicochemical properties. A 12-dimensional vector is then constructed from the frequency and the average position of occurrence of the amino acids in each reduced amino acid sequence. Finally, the Euclidean distance is used to measure the similarity and evolutionary distance between protein sequences. Tests on three datasets show that our method clusters each protein sequence accurately, which illustrates the effectiveness of our method.
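The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the 6-group clustering, so the `GROUPS` table below is a hypothetical grouping by broad physicochemical class, and dividing the average position by the sequence length is likewise an assumed normalization. The function names `feature_vector` and `distance` are chosen here for illustration only.

```python
import math

# Hypothetical 6-group reduction of the 20 amino acids (an assumption for
# illustration; the grouping used in the paper is not given in the abstract).
GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]
GROUP_OF = {aa: g for g, members in enumerate(GROUPS) for aa in members}


def feature_vector(seq):
    """Return 12 numbers: for each group, its frequency and mean relative position."""
    n = len(seq)
    counts = [0] * len(GROUPS)
    pos_sums = [0] * len(GROUPS)
    for i, aa in enumerate(seq, start=1):  # 1-based positions
        g = GROUP_OF.get(aa)
        if g is None:  # skip non-standard residues such as X
            continue
        counts[g] += 1
        pos_sums[g] += i
    vec = []
    for c, s in zip(counts, pos_sums):
        freq = c / n if n else 0.0
        # Mean position divided by sequence length, so that sequences of
        # different lengths are comparable (assumed normalization).
        avg_pos = s / (c * n) if c else 0.0
        vec.extend([freq, avg_pos])
    return vec


def distance(seq_a, seq_b):
    """Euclidean distance between the 12-dimensional vectors of two sequences."""
    return math.dist(feature_vector(seq_a), feature_vector(seq_b))
```

Because each sequence is reduced to a fixed-length vector in a single pass, comparing two sequences costs time linear in their lengths and needs no alignment. A pairwise matrix of `distance` values could then be passed to any standard clustering or tree-building routine.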
About the journal:
MATCH Communications in Mathematical and in Computer Chemistry publishes papers of original research as well as reviews on chemically important mathematical results and non-routine applications of mathematical techniques to chemical problems. A paper acceptable for publication must contain non-trivial mathematics or communicate non-routine computer-based procedures AND have a clear connection to chemistry. Papers are published without any processing or publication charge.