A comparative study on existing methodologies to predict dominating patterns amongst biological sequences

2011 Third International Conference on Advanced Computing. Pub Date: 2011-12-01. DOI: 10.1109/ICOAC.2011.6165177

G. Lakshmi Priya, S. Hariharan


Citations: 5

Abstract

Data mining is the process of extracting patterns from very large datasets, here biological ones. Data mining algorithms can reveal biologically relevant associations between different genes and gene expression. Several data mining techniques are available for predicting frequent patterns. One of them is association rule mining, which can be applied to crucial problems in biological science. In the literature, various algorithms have been employed to generate frequent patterns for distinct applications. These algorithms have limitations in space complexity, time complexity, and accuracy. To overcome these drawbacks, this study examines existing algorithms for generating frequent patterns from biological sequences. The literature survey shows that a significant number of methods have been developed for predicting associative patterns. The proposed system is to be developed for solving problems in biological science. A biological sequence may be a collection of DNA sequences, gene expression sequences, or protein sequences for a specific viral disease. Amino acids are the building blocks of proteins. Proteins are organic compounds made up of amino acids arranged in a linear chain and folded into a globular form. The future proposal will not only predict frequent patterns but also address time complexity, space complexity, and the accuracy of the solution to the problem at hand. By taking these three factors into consideration, an efficient algorithm can be identified for predicting the dominating amino acids for any specific biological implication.
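To make the idea of frequent patterns in biological sequences concrete, the following is a minimal sketch, not the method of any algorithm surveyed in the paper. It assumes a pattern is a contiguous run of `k` amino-acid letters and that support is the fraction of sequences containing it; the function name `frequent_patterns` and the toy protein fragments are hypothetical. Counting support against a minimum threshold is the first step of Apriori-style association rule mining.

```python
from collections import Counter


def frequent_patterns(sequences, k, min_support):
    """Return {pattern: support} for length-k substrings whose support
    (fraction of sequences containing the pattern) is >= min_support."""
    if not sequences:
        return {}
    counts = Counter()
    for seq in sequences:
        # Use a set so each pattern is counted at most once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    n = len(sequences)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


# Hypothetical toy protein fragments (one-letter amino-acid codes).
toy = ["MKVLA", "AKVLG", "MKVIA", "GGKVL"]
result = frequent_patterns(toy, k=2, min_support=0.75)
print(result)
```

On this toy input, only `KV` (present in all four fragments) and `VL` (present in three) reach the 0.75 threshold. A real study would also have to weigh the space, time, and accuracy factors the abstract names, which this sketch does not attempt.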


