Authors: Amir Hassanpour, Habib Izadkhah, A. Isazadeh
Published in: 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS), 2021-12-29
DOI: 10.1109/ICSPIS54653.2021.9729391 (https://doi.org/10.1109/ICSPIS54653.2021.9729391)
Protein Secondary Structure Prediction using Topological Data Analysis
Topological data analysis (TDA) is a rapidly growing area of modern data science that uses topological, geometric, and algebraic tools to extract structural features from complex, large-scale data that are often incomplete and noisy. Its primary motivation is to study the shape of data, which connects it to areas of pure mathematics such as homology, cohomology, and algebraic topology more broadly. The topological space constructed from point cloud data gives the data an interpretation in terms of distance, continuity, and connectedness, so that patterns and relationships in the data can be discovered quickly. In other words, information that was lost or distorted during sampling can be recovered from the sampled data. Persistent homology is one of the essential tools of TDA. In this paper, after introducing the necessary mathematical concepts, we compute persistent homology, extract appropriate features, and build a new dataset; we then develop a deep learning architecture to predict protein secondary structure from the constructed dataset. The accuracy of the proposed method is at least 5% higher than that of previous methods.
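To make the idea of "computing persistent homology and extracting features" concrete, the following is a minimal sketch, not the authors' pipeline. It computes only 0-dimensional persistence (connected components) of a Vietoris-Rips filtration over a small point cloud, using a union-find structure, and then summarizes the resulting bars as a few numbers. The function names `h0_persistence` and `persistence_features`, the toy point cloud, and the choice of summary statistics are all assumptions made for illustration; the abstract does not specify which homology dimensions or features the paper uses.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return (birth, death) pairs for the connected components (H0) of a
    Vietoris-Rips filtration over `points`.

    Assumed convention: an edge enters the filtration at a scale equal to
    the Euclidean distance between its endpoints.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sort all candidate edges by the scale at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one of them dies at this scale.
            parent[root_i] = root_j
            pairs.append((0.0, dist))
    if n:
        # One component survives at every scale.
        pairs.append((0.0, math.inf))
    return pairs


def persistence_features(pairs):
    """Summarize the finite bars as a small, fixed-length feature dict."""
    lifetimes = [death - birth for birth, death in pairs if math.isfinite(death)]
    if not lifetimes:
        return {"count": 0, "max": 0.0, "mean": 0.0}
    return {
        "count": len(lifetimes),
        "max": max(lifetimes),
        "mean": sum(lifetimes) / len(lifetimes),
    }


# Hypothetical toy point cloud: three nearby points and one distant outlier.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
pairs = h0_persistence(cloud)
features = persistence_features(pairs)
print(pairs)
print(features)
```

In this toy example the three nearby points merge at small scales while the outlier joins much later, so the longest finite bar reflects the gap between the cluster and the outlier. A fixed-length summary like this is one simple way to turn a persistence diagram into input for a learning model; higher-dimensional homology (loops, voids) would require a dedicated library.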