Protein Secondary Structure Prediction using Topological Data Analysis

Amir Hassanpour, Habib Izadkhah, A. Isazadeh
Published in: 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS)
Publication date: 2021-12-29
DOI: 10.1109/ICSPIS54653.2021.9729391
URL: https://doi.org/10.1109/ICSPIS54653.2021.9729391
Citations: 0

Abstract

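Topological data analysis (TDA) is a new and rapidly growing area of modern data science that uses topological, geometric, and algebraic tools to extract structural features from very complex, large-scale data that are usually incomplete and noisy. Its primary motivation is to study the shape of data, which connects it to branches of pure mathematics such as homology, cohomology, and algebraic topology. In this approach, the topological space built from point-cloud data gives the data an interpretation in terms of distance, continuity, and connectedness, so that patterns and relationships in the data can be discovered quickly. In other words, the original information can be recovered from sampled data, even when some of it was accidentally lost or corrupted during sampling. Persistent homology is one of the essential tools of TDA. In this paper, after introducing the necessary mathematical concepts, we compute persistent homology and extract appropriate features to build a new dataset, and then develop a deep learning architecture that predicts protein secondary structure from the constructed dataset. The accuracy of the proposed method is at least 5% higher than that of previous methods.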
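
To make the persistent homology step concrete, here is a minimal Python sketch of the simplest case: 0-dimensional persistence of a Vietoris-Rips filtration on a small point cloud, followed by a few summary statistics of the kind that could feed a learning model. The abstract does not describe the authors' actual features, dataset, or network, so everything below is an illustrative assumption: the function names `h0_persistence` and `summary_features`, the choice of statistics, and the toy point cloud are all hypothetical.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death scales of connected components in a Vietoris-Rips filtration.

    Every point is born as its own component at scale 0. A component dies
    when an edge of that length merges it into another component. The one
    component that never dies is omitted from the result.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # merge the two components
            deaths.append(length)    # one component dies at this scale
    return deaths


def summary_features(deaths):
    """Hypothetical feature vector: count, longest, and mean finite lifetime."""
    if not deaths:
        return [0, 0.0, 0.0]
    return [len(deaths), max(deaths), sum(deaths) / len(deaths)]


# Toy point cloud (an assumption for illustration): two well-separated pairs.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_persistence(cloud)
features = summary_features(deaths)
print(deaths, features)
```

In this toy cloud the two short lifetimes come from the points within each pair merging, and the single long lifetime reflects the gap between the two pairs; long-lived classes are the ones persistent homology treats as structure rather than noise. A real pipeline would typically also use higher-dimensional classes (loops, voids) computed with a dedicated TDA library.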