Top-k基于效用的基因调控序列模式发现

Morteza Zihayat, Heydar Davoudi, Aijun An
{"title":"Top-k基于效用的基因调控序列模式发现","authors":"Morteza Zihayat, Heydar Davoudi, Aijun An","doi":"10.1109/BIBM.2016.7822529","DOIUrl":null,"url":null,"abstract":"Sequential pattern mining has been used in bioinformatics to discover frequent gene regulation sequential patterns based on time course microarray datasets. While mining frequent sequences are important in biological studies for disease treatment, to date, most of the approaches do not consider the importance of the genes with respect to a disease being studied when identifying gene regulation sequential patterns. In addition, they focus on the more general up/down effects of genes in a microarray dataset and do not take into account the various degrees of expression during the mining process. As a result, the current techniques return too many sequences which may not be informative enough for biologists to explore relationships between the disease and underlying causes encoded in gene regulation sequences. In this paper, we propose a utility model by considering both the importance of genes with respect to a disease and their degrees of expression levels under a biological investigation. Then, we design a new method, called TU-SEQ, for identifying top-k high utility gene regulation sequential patterns from a time-course microarray dataset. The evaluation results show that our approach can effectively and efficiently discover key patterns representing meaningful gene regulation sequential patterns in a time course microarray dataset.","PeriodicalId":345384,"journal":{"name":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Top-k utility-based gene regulation sequential pattern discovery\",\"authors\":\"Morteza Zihayat, Heydar Davoudi, Aijun An\",\"doi\":\"10.1109/BIBM.2016.7822529\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Sequential pattern mining has been used in bioinformatics to discover frequent gene regulation sequential patterns based on time course microarray datasets. While mining frequent sequences are important in biological studies for disease treatment, to date, most of the approaches do not consider the importance of the genes with respect to a disease being studied when identifying gene regulation sequential patterns. In addition, they focus on the more general up/down effects of genes in a microarray dataset and do not take into account the various degrees of expression during the mining process. As a result, the current techniques return too many sequences which may not be informative enough for biologists to explore relationships between the disease and underlying causes encoded in gene regulation sequences. In this paper, we propose a utility model by considering both the importance of genes with respect to a disease and their degrees of expression levels under a biological investigation. Then, we design a new method, called TU-SEQ, for identifying top-k high utility gene regulation sequential patterns from a time-course microarray dataset. The evaluation results show that our approach can effectively and efficiently discover key patterns representing meaningful gene regulation sequential patterns in a time course microarray dataset.\",\"PeriodicalId\":345384,\"journal\":{\"name\":\"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)\",\"volume\":\"20 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/BIBM.2016.7822529\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BIBM.2016.7822529","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

摘要

序列模式挖掘已被应用于生物信息学中,用于发现基于时序微阵列数据集的频繁基因调控序列模式。虽然挖掘频繁序列在疾病治疗的生物学研究中很重要,但迄今为止,在确定基因调控序列模式时,大多数方法都没有考虑到基因对所研究疾病的重要性。此外,他们关注的是基因在微阵列数据集中更普遍的上/下效应,而没有考虑到挖掘过程中不同程度的表达。因此,目前的技术返回了太多的序列,这些序列可能不足以为生物学家探索基因调控序列中编码的疾病和潜在原因之间的关系提供足够的信息。在本文中,我们提出了一种实用新型,通过考虑基因对疾病的重要性及其在生物学研究中的表达程度。然后,我们设计了一种名为TU-SEQ的新方法,用于从时间过程微阵列数据集中识别top-k高效用基因调控序列模式。评估结果表明,我们的方法可以有效地在时间过程微阵列数据集中发现代表有意义的基因调控序列模式的关键模式。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Top-k utility-based gene regulation sequential pattern discovery
Sequential pattern mining has been used in bioinformatics to discover frequent gene regulation sequential patterns based on time course microarray datasets. While mining frequent sequences are important in biological studies for disease treatment, to date, most of the approaches do not consider the importance of the genes with respect to a disease being studied when identifying gene regulation sequential patterns. In addition, they focus on the more general up/down effects of genes in a microarray dataset and do not take into account the various degrees of expression during the mining process. As a result, the current techniques return too many sequences which may not be informative enough for biologists to explore relationships between the disease and underlying causes encoded in gene regulation sequences. In this paper, we propose a utility model by considering both the importance of genes with respect to a disease and their degrees of expression levels under a biological investigation. Then, we design a new method, called TU-SEQ, for identifying top-k high utility gene regulation sequential patterns from a time-course microarray dataset. The evaluation results show that our approach can effectively and efficiently discover key patterns representing meaningful gene regulation sequential patterns in a time course microarray dataset.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
The role of high performance, grid and cloud computing in high-throughput sequencing A novel algorithm for identifying essential proteins by integrating subcellular localization CNNsite: Prediction of DNA-binding residues in proteins using Convolutional Neural Network with sequence features Inferring Social Influence of anti-Tobacco mass media campaigns Emotion recognition from multi-channel EEG data through Convolutional Recurrent Neural Network
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1