
International Journal of Bioinformatics Research and Applications: Latest Articles

Developing a novel test to detect cancer genes from microarray data.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.065246
Shreya Mathur, Sunil Mathur

DNA microarray technology can simultaneously screen thousands of gene expression profiles, transforming how genetics is applied in medicine. However, the lack of normality in microarray data renders common statistical methods ineffective. We propose a novel statistical method that does not require stringent assumptions but is still more powerful than some of its competitors. Using both simulation studies and clinical data, we show that our novel method outperforms previous methods. The limiting distribution of the proposed test is obtained under both the null and alternative hypotheses. The proposed test will help make cancer treatment and gene therapy more successful, and it may facilitate research regarding cancer vaccinations. The proposed test may also help in the development of a prediction model in genetic profiling studies, built on a subset of differentially expressed genes and the clinical data, to assess the accuracy of the clinical prediction.

Citations: 0
Introducing the hypothome: a way to integrate predicted proteins in interactomes.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.065247
Claus Desler, Sine Zambach, Prashanth Suravajhala, Lene Juel Rasmussen

An interactome is defined as a network of protein-protein interactions built from experimentally verified interactions. Basic science as well as application-based research on potential new drugs can be promoted by including proteins that are only predicted in interactomes. The disadvantage of doing so is the risk of devaluing the definition of interactomes. Once proteins that have only been predicted are added, an interactome can no longer be classified as experimentally verified and its integrity is compromised. Therefore, we propose the term 'hypothome' (collection of hypothetical interactions of predicted proteins). The purpose of such a term is to provide a denotation for the interactome concept that allows the interaction of predicted proteins without devaluing the integrity of the interactome. We define a rule set for a hypothome and, as an example of its use, integrate the predicted protein interaction partners of the hypothetical protein EAW74251.
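
The separation described in this abstract can be pictured as a network whose edges carry an evidence flag. The following is only a minimal sketch under that assumption: the `Interaction` class, the `split_network` function, and the partner names `P1` and `P2` are hypothetical, and the authors' actual rule set is not given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    a: str
    b: str
    verified: bool  # True: experimentally verified; False: only predicted


def split_network(interactions):
    """Split interactions into a verified interactome and a predicted hypothome."""
    interactome = [i for i in interactions if i.verified]
    hypothome = [i for i in interactions if not i.verified]
    return interactome, hypothome


# P1 and P2 are placeholder protein names, not real interaction partners.
edges = [
    Interaction("P1", "P2", verified=True),
    Interaction("EAW74251", "P1", verified=False),
]
interactome, hypothome = split_network(edges)
```

Keeping the two edge sets apart means the verified network can still be reported on its own, while predicted partners remain available for hypothesis generation.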

Citations: 3
Mapping genomic features to functional traits through microbial whole genome sequences.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.062995
Wei Zhang, Erliang Zeng, Dan Liu, Stuart E Jones, Scott Emrich

Recently, the utility of trait-based approaches for microbial communities has been identified. The increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped into Clusters of Orthologous Groups (COGs) and used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to account for the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights.
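
To make the TF-IDF step concrete, the sketch below treats each genome as a "document" and each COG as a "term". It assumes the plain textbook weighting (relative frequency times the log of the inverse document frequency); the abstract does not say which TF-IDF variant the authors used, and the genome names and COG identifiers here are illustrative placeholders.

```python
import math


def tfidf(counts):
    """counts: {genome: {cog_id: gene_count}} -> {genome: {cog_id: weight}}."""
    n_genomes = len(counts)
    # Document frequency: in how many genomes does each COG occur?
    df = {}
    for cog_counts in counts.values():
        for cog, c in cog_counts.items():
            if c > 0:
                df[cog] = df.get(cog, 0) + 1
    weights = {}
    for genome, cog_counts in counts.items():
        total = sum(cog_counts.values())
        weights[genome] = {
            cog: (c / total) * math.log(n_genomes / df[cog])
            for cog, c in cog_counts.items()
            if c > 0
        }
    return weights


# Hypothetical toy input: three genomes with gene counts per COG.
genomes = {
    "genome_A": {"COG0001": 3, "COG0002": 1},
    "genome_B": {"COG0001": 2, "COG0003": 2},
    "genome_C": {"COG0001": 1, "COG0002": 4},
}
w = tfidf(genomes)
```

A COG present in every genome (here `COG0001`) gets weight zero, while a COG confined to one genome is up-weighted, which is the sense in which the transform balances abundance against importance.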

Citations: 6
Parameter discovery in stochastic biological models using simulated annealing and statistical model checking.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.062998
Faraz Hussain, Sumit K Jha, Susmit Jha, Christopher J Langmead

Stochastic models are increasingly used to study the behaviour of biochemical systems. While the structure of such models is often readily available from first principles, unknown quantitative features of the model are incorporated into the model as parameters. Algorithmic discovery of parameter values from experimentally observed facts remains a challenge for the computational systems biology community. We present a new parameter discovery algorithm that uses simulated annealing, sequential hypothesis testing, and statistical model checking to learn the parameters in a stochastic model. We apply our technique to a model of glucose and insulin metabolism used for in-silico validation of artificial pancreata and demonstrate its effectiveness by developing a parallel CUDA-based implementation for parameter synthesis in this model.
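
The sequential hypothesis testing ingredient can be illustrated with Wald's sequential probability ratio test (SPRT), a standard choice in statistical model checking. This is a generic sketch, not the authors' algorithm: it leaves out the simulated annealing search and the CUDA implementation, and the function name, thresholds, and the stand-in `sample` callback are assumptions made for illustration.

```python
import math
import random


def sprt(sample, p0, p1, alpha=0.01, beta=0.01, max_samples=10_000):
    """Wald's SPRT for a Bernoulli success probability p.

    Tests H0: p >= p0 against H1: p <= p1 (requires p1 < p0).
    `sample()` returns True when one simulated trajectory satisfies the
    property being checked. Returns (decision, samples_used), where
    decision is "H0", "H1" or "undecided".
    """
    accept_h1 = math.log((1 - beta) / alpha)  # upper threshold
    accept_h0 = math.log(beta / (1 - alpha))  # lower threshold
    llr = 0.0  # log-likelihood ratio of H1 versus H0
    for n in range(1, max_samples + 1):
        if sample():
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= accept_h1:
            return "H1", n
        if llr <= accept_h0:
            return "H0", n
    return "undecided", max_samples


# Stand-in for a stochastic simulation: the property holds with probability 0.9.
rng = random.Random(0)
decision, used = sprt(lambda: rng.random() < 0.9, p0=0.8, p1=0.6)
```

The appeal of a sequential test in this setting is that the number of simulations adapts to how clear-cut the answer is, so a search over candidate parameter values spends few samples on obviously poor candidates.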

Citations: 0
This special issue includes a selection of papers presented at the 2nd IEEE International Conference. Introduction.
Q4 Health Professions Pub Date : 2014-01-01
Ion Mandoiu, Mihai Pop, Sanguthevar Rajasekaran, John L Spouge
Citations: 0
Understanding the importance of natural neuromotor strategy in upper extremity neuroprosthetic control.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.059521
Dominic E Nathan, Robert W Prost, Stephen J Guastello, Dean C Jeutter

A key challenge in upper extremity neuroprosthetics is variable levels of skill and inconsistent functional recovery. We examine the feasibility and benefits of using natural neuromotor strategies through the design and development of a proof-of-concept model for a feed-forward upper extremity neuroprosthetic controller. Developed using Artificial Neural Networks, the model is able to extract and classify neural correlates of movement intention from multiple brain regions that correspond to functional movements. This distinguishes it from contemporary controllers, which record from limited physiological sources or require learning of new strategies. Functional MRI (fMRI) data from healthy subjects (N = 13) were used to develop the model, and a separate group of subjects (N = 4) was used for validation. Results indicate that the model is able to predict hand movement with 81% accuracy strictly from the neural correlates of movement intention. Information from this study is applicable to the development of technology-aided interventions for the upper extremity.

Citations: 0
In silico analysis of plant and animal transposable elements.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.060763
Mo-Li Huang, Songsak Wattanachaisaereekul, Yu-Jun Han, Wanwipa Vongsangnak

Transposable Elements (TEs) play important roles in the evolution of eukaryotic organisms. TEs are widely distributed in genomes, depending on their properties. This study elucidated the molecular characteristics of TEs in land plants and animals using bioinformatics and an in silico mutational approach. We discovered that GC-rich class I TEs are the predominant class of TEs in animals, whereas AT-rich class II TEs are prevalent in plants. The GC-rich class I TEs appear to have evolved within animals. In contrast, the preserved AT-richness of class II TEs is believed to contribute to host defence systems.
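
"GC-rich" and "AT-rich" refer to the share of G and C bases in a sequence. A minimal sketch of that measure follows; the function name and the example sequences are illustrative, and the abstract does not state how the authors computed base composition.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases, e.g. a run of N
    return (seq.count("G") + seq.count("C")) / acgt


gc_rich = gc_content("GGCCAT")   # 4 of 6 bases are G or C
at_rich = gc_content("ATATGC")   # 2 of 6 bases are G or C
```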

Citations: 0
Pairwise sequence alignment for very long sequences on GPUs.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.062989
Junjie Li, Sanjay Ranka, Sartaj Sahni

We develop novel single-GPU parallelisations of the Smith-Waterman algorithm for pairwise sequence alignment. Our algorithms, which are suitable for the alignment of a single pair of very long sequences, can be used to determine the alignment score as well as the actual alignment. Experimental results demonstrate an order of magnitude reduction in run time relative to competing GPU algorithms.
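
For reference, the Smith-Waterman recurrence that such work parallelises can be written sequentially in a few lines. The sketch below computes only the best local alignment score, uses a linear gap penalty, and keeps two rows in memory; the scoring values are arbitrary assumptions, and none of the paper's GPU techniques are reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b, using O(len(b)) memory."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman_score("TTACGTTT", "GGACGGG")  # shared local region "ACG"
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal can be computed independently; this is the usual source of parallelism in the recurrence. Recovering the actual alignment, as opposed to the score, additionally requires traceback information.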

Citations: 29
Scaling up genome annotation using MAKER and work queue.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.062994
Andrew Thrasher, Zachary Musgrave, Brian Kachmarck, Douglas Thain, Scott Emrich

Next generation sequencing technologies have enabled the sequencing of many genomes. Because of the overall increasing demand and the inherent parallelism available in many required analyses, these bioinformatics applications should ideally run on clusters, clouds and/or grids. We present a modified annotation framework that achieves a speed-up of 45x with 50 workers on a Caenorhabditis japonica test case. We also evaluate these modifications within the Amazon EC2 cloud framework. The underlying genome annotation tool (MAKER) is parallelised as an MPI application. Our framework enables it to run without MPI while utilising a wide variety of distributed computing resources. This parallel framework also allows easy explicit data transfer, which helps overcome a major limitation of bioinformatics tools that often rely on shared file systems. Combined, our proposed framework can be used, even during early stages of development, to easily run sequence analysis tools on clusters, grids and clouds.
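
The reported figure of a 45x speed-up on 50 workers can be put in context with two standard scaling measures. The short calculation below is an interpretation added for illustration, not a result from the paper: it assumes an Amdahl-style model, and the function names are hypothetical.

```python
def parallel_efficiency(speedup, workers):
    """Speed-up achieved per worker; 1.0 would be perfect scaling."""
    return speedup / workers


def karp_flatt(speedup, workers):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1 / speedup - 1 / workers) / (1 - 1 / workers)


efficiency = parallel_efficiency(45, 50)  # 0.9
serial_fraction = karp_flatt(45, 50)      # roughly 0.0023
```

Under these assumptions the run is about 90% efficient, and the implied serial or overhead share of the work is on the order of a quarter of a percent.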

Citations: 9
Discovering non-coding RNA elements in Drosophila 3' untranslated regions.
Q4 Health Professions Pub Date : 2014-01-01 DOI: 10.1504/IJBRA.2014.062996
Cuncong Zhong, Justen Andrews, Shaojie Zhang

The Non-Coding RNA (ncRNA) elements in the 3' Untranslated Regions (3'-UTRs) are known to participate in the post-transcriptional regulation of genes. Inferring co-expression patterns of the genes by clustering these 3'-UTR ncRNA elements will provide invaluable insights for studying their biological functions. In this paper, we propose an improved RNA structural clustering pipeline. A benchmark of the new pipeline on Rfam data demonstrates over 10% performance improvement compared to the traditional hierarchical clustering pipeline. By applying the new clustering pipeline to the 3'-UTRs of the Drosophila melanogaster genome, we have successfully identified 184 ncRNA clusters with 91.3% accuracy. One of these clusters corresponds to genes that are preferentially expressed in male Drosophila. Another cluster contains genes that are responsible for the functions of septate junctions in epithelial cells. These discoveries encourage more studies on novel post-transcriptional regulation mechanisms.

Citations: 1