A comparative study on existing methodologies to predict dominating patterns amongst biological sequences

G. Lakshmi Priya, S. Hariharan
Published in: 2011 Third International Conference on Advanced Computing, 2011-12-01. DOI: 10.1109/ICOAC.2011.6165177 (https://doi.org/10.1109/ICOAC.2011.6165177)
Citations: 5

Abstract

Data mining is the process of extracting patterns from very large datasets, here biological ones. Data mining algorithms can reveal biologically relevant associations between different genes and gene expression. Several data mining techniques are available for predicting frequent patterns. One of them is association rule mining, which can be applied to crucial problems in biological science. In the literature, various algorithms have been employed to generate frequent patterns for distinct applications. These algorithms have limitations in space, time complexity and accuracy. To address these drawbacks, this study examines existing algorithms for generating frequent patterns from biological sequences. The literature survey shows that a significant number of methods have been developed for predicting associative patterns. The proposed system is to be developed for solving problems in biological science. A biological sequence dataset may be a collection of DNA sequences, gene expression sequences or protein sequences for a specific viral disease. Amino acids are the building blocks of proteins. Proteins are organic compounds made up of amino acids arranged in a linear chain and folded into a globular form. The future proposal aims not only to predict frequent patterns but also to satisfy three factors: time complexity, space and accuracy of the solution. Taking these three factors into consideration, an efficient algorithm can be identified for predicting the dominating amino acids for any specific biological implication.
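To make the idea of frequent pattern mining on sequences concrete, here is a minimal Apriori-style sketch. It is not the authors' method, and the abstract does not name a specific algorithm beyond association rule mining. The sketch assumes each protein sequence is reduced to the set of amino-acid letters it contains (order and repeats are ignored), and the function name `frequent_itemsets`, the toy sequences, and the 0.75 support threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def frequent_itemsets(sequences, min_support):
    """Apriori-style search for residue sets shared by many sequences.

    Assumption: each sequence is treated as a transaction holding the
    distinct amino-acid letters it contains.
    Returns a dict mapping frozenset of residues -> support in [0, 1].
    """
    transactions = [frozenset(seq) for seq in sequences]
    n = len(transactions)

    def support(itemset):
        # Fraction of sequences that contain every residue in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: single residues that meet the support threshold.
    singles = {frozenset([aa]) for t in transactions for aa in t}
    current = {s: support(s) for s in singles}
    current = {s: v for s, v in current.items() if v >= min_support}

    result = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: v for c, v in current.items() if v >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical toy sequences in one-letter amino-acid codes.
toy_sequences = ["ACD", "ACE", "ADE", "ACDE"]
patterns = frequent_itemsets(toy_sequences, min_support=0.75)
for itemset, value in sorted(patterns.items(), key=lambda kv: -kv[1]):
    print("".join(sorted(itemset)), value)
```

The prune step is where the time and space trade-off mentioned in the abstract shows up: raising `min_support` discards candidates earlier and shrinks both the candidate set and the number of passes over the data, at the cost of missing rarer patterns.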