RLALIGN: A Reinforcement Learning Approach for Multiple Sequence Alignment
R. Ramakrishnan, Jaspal Singh, M. Blanchette
2018 IEEE 18th International Conference on Bioinformatics and Bioengineering (BIBE), October 2018
DOI: 10.1109/BIBE.2018.00019
Citations: 8
Abstract
Multiple sequence alignment (MSA) is one of the best-studied problems in bioinformatics because of the broad set of genomic, proteomic, and evolutionary analyses that rely on it. Yet the problem is NP-hard, and existing heuristics are imperfect. Reinforcement learning (RL) techniques have recently emerged as a potential solution to a wide variety of computational problems, but have yet to be applied to MSA. In this paper, we describe RLALIGN, a method to solve the MSA problem using RL. RLALIGN is based on Asynchronous Advantage Actor-Critic (A3C), a cutting-edge RL framework. Because MSA lacks a well-defined goal state, however, the framework required several important modifications. RLALIGN can be trained to accurately align moderate-length sequences, and various heuristics allow it to scale to longer sequences. The accuracy of the alignments produced is on par with, and often better than, that of well-established alignment algorithms. Overall, our work demonstrates the potential of RL approaches for complex combinatorial problems such as MSA. RLALIGN will prove useful for realignment tasks, where portions of a larger alignment need to be optimized. Unlike classical algorithms, RLALIGN is agnostic to the nature of the scoring scheme, so it generalizes easily to a variety of problem variants.
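The abstract does not specify the environment formulation, so as a rough illustration only, one common way to cast MSA as an RL problem is: the state is a gapped alignment, an action shifts a gap within one sequence, and the reward is the resulting change in a sum-of-pairs score. The sketch below uses a toy scoring scheme; all names, scores, and the action encoding are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch of an MSA-as-RL environment (not the RLALIGN code).
# State: a gapped alignment (equal-length strings over {A,C,G,T,-}).
# Action: move one gap within one sequence.
# Reward: change in the sum-of-pairs score after the move.
from itertools import combinations

MATCH, MISMATCH, GAP = 2, -1, -2  # assumed toy scoring scheme

def sum_of_pairs(alignment):
    """Sum-of-pairs score over all columns; a gap paired with anything scores GAP."""
    score = 0
    for col in zip(*alignment):
        for a, b in combinations(col, 2):
            if a == '-' or b == '-':
                score += GAP
            elif a == b:
                score += MATCH
            else:
                score += MISMATCH
    return score

def step(alignment, seq_idx, gap_pos, new_pos):
    """Apply an action: move the gap at gap_pos in sequence seq_idx to new_pos.

    Returns (new_alignment, reward), where reward is the score change --
    the kind of dense signal an actor-critic learner could train on.
    """
    row = list(alignment[seq_idx])
    assert row[gap_pos] == '-', "action must target a gap column"
    del row[gap_pos]
    row.insert(new_pos, '-')
    new_alignment = list(alignment)
    new_alignment[seq_idx] = ''.join(row)
    reward = sum_of_pairs(new_alignment) - sum_of_pairs(alignment)
    return new_alignment, reward

aln = ["AC-T", "ACGT"]
print(sum_of_pairs(aln))            # 2 + 2 - 2 + 2 = 4
new_aln, r = step(aln, 0, 2, 3)     # move the gap right: "AC-T" -> "ACT-"
print(new_aln, r)
```

Because the policy only ever sees rewards, swapping in a different scoring scheme (e.g. affine gap costs or a substitution matrix) changes nothing else in the setup, which is one way to read the abstract's claim that the method generalizes across scoring schemes.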