Iñaki Amatria-Barral, J. González-Domínguez, J. Touriño
Applied Computing Review, published 2023-03-27. DOI: 10.1145/3555776.3577772 (https://doi.org/10.1145/3555776.3577772)
Parallel construction of RNA databases for extensive lncRNA-RNA interaction prediction
Long non-coding RNAs (lncRNAs) have completely changed how scientists approach genetics. While some believe that many lncRNAs result from spurious transcription, recent evidence suggests that thousands of them exist and that they are functional and regulate key biological processes. To support the experimental characterization of lncRNAs, many tools have been developed that try to predict their interactions with other RNAs. Some of the fastest and most accurate tools, however, require a slow database construction step before interaction partners can be identified for each lncRNA. This paper presents a novel and efficient parallel database construction procedure. Benchmarking results on a 16-node multicore cluster show that our parallel algorithm can build databases up to 318 times faster than other available tools while using just 256 CPU cores. All the code developed in this work is available on GitHub under the MIT License (https://github.com/UDC-GAC/pRIblast).