Data mining techniques to predict protein secondary structures

2013 5th International Conference on Modeling, Simulation and Applied Optimization (ICMSAO). Pub Date: 2013-04-28. DOI: 10.1109/ICMSAO.2013.6552701

Sondes Fayech, N. Essoussi, M. Limam

Citations: 1

Abstract

Protein secondary structure prediction is a key step in predicting protein tertiary structure. Many methods based on machine learning techniques, such as neural networks (NN) and support vector machines (SVM), have emerged for predicting secondary structure. This paper proposes a new method, DM-pred, that combines a protein clustering method to detect homologous sequences, a sequential pattern mining method to detect frequent patterns, feature extraction and quantification approaches to prepare features, and an SVM to predict structures. When tested on the most popular secondary structure datasets, DM-pred achieved a Q3 accuracy of 78.20% and a SOV of 76.49%, which indicates that it is among the top-performing methods for protein secondary structure prediction.
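For readers unfamiliar with the Q3 measure quoted above, the following is a minimal sketch of how it is conventionally computed: the percentage of residues whose three-state label (helix H, strand E, coil C) matches the observed label. The function name `q3_accuracy` and the example strings are hypothetical illustrations, not code or data from the paper, and the sketch does not cover the segment-based SOV score.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state label
    (H = helix, E = strand, C = coil) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical example strings (assumed for illustration, not from the paper).
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"
print(f"Q3 = {q3_accuracy(predicted, observed):.2f}%")
```

Q3 is a per-residue score, so it does not penalize fragmented segment predictions; that gap is the usual reason a segment-overlap score such as SOV is reported alongside it.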


