Rule-based Assembly for Short Read Data Set obtained with Multiple Assemblers and k-mer Sizes (情報論的学習理論と機械学習)

Aya Oshiro, H. Afuso, T. Okazaki
DOI: 10.2197/IPSJTBIO.10.9 (https://doi.org/10.2197/IPSJTBIO.10.9)
Journal: IEICE technical report. Speech, volume 17 1
Publication date: 2015-06-23
Publication type: Journal Article
Citations: 1

Abstract

Various de novo assembly methods based on the concept of k-mers have been proposed. Despite the success of these methods, an alternative, referred to as the hybrid approach, has recently been proposed; it combines different traditional methods to exploit the properties of each in an integrated manner. However, the results obtained from the traditional methods used in the hybrid approach depend not only on the specific algorithm or heuristics but also on the user-specified k-mer size. Consequently, the results obtained with the hybrid approach depend on these factors as well. Here, we design a new assembly approach, referred to as rule-based assembly. It follows a strategy similar to the hybrid approach, but applies rules learned from certain characteristics of draft contigs to remove erroneous contigs and then merges the remaining ones. To construct the most effective rules for this purpose, we propose a learning method based on decision trees, namely a complex decision tree. Comparative experiments were conducted to validate the method. The results show that the proposed method can outperform traditional methods in certain cases.
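The filter-then-merge idea in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `DraftContig` fields, the `coverage` feature, the thresholds in `keep_contig` (a hand-written stand-in for a rule that would be learned by a decision tree), and the naive merge step, which only drops contigs fully contained in a longer kept contig. The `kmers` helper shows why the choice of k matters: it determines how a read is decomposed before assembly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DraftContig:
    """A draft contig with illustrative features (all field names are hypothetical)."""
    sequence: str
    assembler: str   # label of the assembler that produced the contig
    k: int           # k-mer size used for that assembly run
    coverage: float  # mean read coverage (assumed feature, not from the paper)


def kmers(sequence: str, k: int) -> list[str]:
    """Return every overlapping substring of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def keep_contig(contig: DraftContig) -> bool:
    """Hand-written stand-in for a learned rule; thresholds are arbitrary."""
    if len(contig.sequence) < 8:
        return False
    return contig.coverage >= 5.0


def filter_and_merge(contigs: list[DraftContig]) -> list[str]:
    """Drop contigs rejected by the rule, then remove contained sequences."""
    kept = [c for c in contigs if keep_contig(c)]
    # Process longest first so shorter contained contigs can be detected.
    kept.sort(key=lambda c: len(c.sequence), reverse=True)
    merged: list[str] = []
    for contig in kept:
        if not any(contig.sequence in longer for longer in merged):
            merged.append(contig.sequence)
    return merged


# Draft contigs pooled from two hypothetical assemblers and k-mer sizes.
pool = [
    DraftContig("ACGTACGTAC", "assembler_a", 21, 10.0),
    DraftContig("CGTACGTA", "assembler_b", 31, 8.0),      # contained in the first
    DraftContig("TTTTGGGGCCCC", "assembler_b", 31, 2.0),  # low coverage
    DraftContig("ACG", "assembler_a", 21, 50.0),          # too short
]
result = filter_and_merge(pool)
```

In a fuller version, the body of `keep_contig` would be replaced by rules extracted from a trained decision tree, and the containment check by a proper overlap-based merge.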