IPSJ Transactions on Bioinformatics: Latest Articles

Predicting Three-way Interactions of Proteins from Expression Profiles Based on Correlation Coefficient
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2012-10-01 DOI: 10.2197/IPSJTBIO.5.34
E. Inoue, S. Murakami, T. Fujiki, Takuya Yoshihiro, Atsushi Takemoto, Haruka Ikegami, Kazuya Matsumoto, Masaru Nakagawa
In this study, we propose a new method for predicting three-way interactions among proteins based on the correlation coefficients of protein expression profiles. Although three-way interactions have not been studied well, they are important for understanding the system of life. Previous studies reported three-way interactions based on switching mechanisms, in which a property or expression level of one protein switches the mechanism of interaction between two other proteins. In this paper, we propose a new method to predict three-way interactions based on a model in which A and B work together to affect the expression level of C. We present an algorithm that predicts combinations of three proteins having such a three-way interaction, and evaluate it using our real proteome data.
Citations: 5
DINE: A Novel Score Function for Modeling Multidomain Protein Structures with Domain Linker and Interface Restraints
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2012-04-19 DOI: 10.2197/IPSJTBIO.5.18
Satoru Hirako, M. Shionyu
The functional sites of multidomain proteins are often found at the interfaces of two or more domains. Therefore, the spatial arrangement of the domains is essential in understanding the functional mechanisms of multidomain proteins. However, an experimental determination of the whole structure of a multidomain protein is often difficult due to flexibility in inter-domain arrangement. We have developed a score function, named DINE, to detect probable docking poses generated in a rigid-body docking simulation. This score function takes into account the binding energy, information about the domain interfaces of homologous proteins, and the end-to-end distance spanned by the domain linker. We have examined the performance of DINE on 55 non-redundant known structures of two-domain proteins. In the results, the near-native docking poses were scored within the top 10 in 65.5% of the test cases. DINE scored the near-native poses higher in comparison with an existing domain assembly method, which also used binding energy and linker distance restraints. The results demonstrate that the domain-interface restraints of DINE are quite efficient in selecting near-native domain assemblies.
Citations: 1
Analysis of Correlation between Gene Expression and Aberrant Epigenetic Status in Alzheimer's Disease Brain
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2012-02-24 DOI: 10.2197/IPSJTBIO.5.2
K. Yano
Dysregulation of epigenetic mechanisms has been implicated in the pathogenesis of Alzheimer's disease (AD). It has been shown that epigenetic status in promoter regions can alter levels of gene expression, but its influence on correlated expression of genes and its dependency on the disease are unclear. Using publicly available microarray and DNA methylation data, this article infers how correlation in gene expression in non-demented (ND) and AD brain may be influenced by genomic promoter methylation. Pearson correlation coefficients of gene expression levels between each of 123 known hypomethylated genes and all other genes in the microarray dataset were calculated, and the mean absolute coefficient was taken as the overall strength of gene expression correlation of each hypomethylated gene. The distribution of the mean absolute coefficients showed that the hypomethylated genes can be divided into two groups, with mean coefficients above or below 0.15. This division of the hypomethylated genes was more evident in AD brain than in ND brain. On the other hand, hypermethylated genes had a single dominant group, and the majority of them had a mean coefficient below 0.15. These results suggest that the lower the DNA methylation, the higher the correlation of gene expression levels with the other genes in microarray data. The strength of gene expression correlation was also calculated between known AD risk genes and all other genes in microarray data. AD risk genes were more likely to have mean absolute correlation coefficients above 0.15 in AD brain when the evidence for their association with AD was strong, suggesting a link between DNA methylation and AD. In conclusion, DNA methylation status is intimately associated with correlated gene expression, particularly in AD brain.
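The correlation-strength measure described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical dictionary `expression` that maps gene names to per-sample expression levels; the gene names and values are toy data, not taken from the paper, and only the 0.15 cut-off comes from the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def mean_abs_correlation(target, expression):
    """Mean |r| between gene `target` and every other gene in `expression`."""
    others = [g for g in expression if g != target]
    total = sum(abs(pearson(expression[target], expression[g])) for g in others)
    return total / len(others)

# Toy data (hypothetical): gene name -> expression level per sample.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with geneA
}
score = mean_abs_correlation("geneA", expression)
high_group = score > 0.15  # 0.15 is the cut-off quoted in the abstract
```

Taking the absolute value before averaging means that strong negative correlations count as much as strong positive ones, which is why `geneC` raises the score here rather than cancelling `geneB`.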
Citations: 0
A GPU Accelerated Fragment-based De Novo Ligand Design by a Bayesian Optimization Algorithm
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2012-01-01 DOI: 10.2197/IPSJTBIO.5.7
M. Wahib, Asim Munawar, M. Munetomo, K. Akama
De novo ligand design is an automatic fragment-based design of molecules within a protein binding site of a known structure. A Bayesian Optimization Algorithm (BOA), a meta-heuristic algorithm, is introduced to join predocked fragments with a user-supplied list of fragments. A novel feature proposed is the simultaneous optimization of force field energy and a term enforcing 3D overlap with known binding mode(s). The performance of the algorithm is tested on Liver X receptors (LXRs) using a library of about 14,000 fragments and the binding mode of a known heterocyclic phenyl acetic acid to bias the design. We further introduce the use of a GPU (Graphics Processing Unit) to overcome the excessive time required to evaluate each possible fragment combination. We show how GPU utilization enables experimenting with larger fragment sets and target receptors for more complex instances. The results show how NVIDIA's Tesla C2050 GPU was utilized to generate complex agonists effectively. In fact, eight of the 1,809 molecules designed for LXRs are found in the ZINC database of commercially available compounds.
Citations: 0
Distance-based Factor Graph Linearization and Sampled Max-sum Algorithm for Efficient 3D Potential Decoding of Macromolecules
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2011-09-06 DOI: 10.2197/IPSJTBIO.4.34
T. Shinozaki, Toshinao Iwaki, Shiqiao Du, M. Sekijima, S. Furui
Three-dimensional structure prediction of a molecule can be modeled as a minimum energy search problem in a potential landscape. Popular ab initio structure prediction approaches based on this formalization are the Monte Carlo methods represented by the Metropolis method. However, their prediction performance degrades for larger molecules such as proteins, since the search space is exponential in the number of atoms. In order to search the exponential space more efficiently, we propose a new method that models the potential landscape as a factor graph. The key ideas are slicing the factor graph based on the maximum distance of bonded atoms to convert it to a linearly structured graph, and using the max-sum search algorithm combined with sampling. The method is referred to as Slice Chain Max-Sum, and it has the advantage that the search is efficient because the graph is linear. Experiments are performed using polypeptides having 50 to 300 amino acid residues. The proposed method is shown to be computationally more efficient than the Metropolis method for large molecules.
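To see why a linear (chain) graph makes the search efficient, here is a minimal sketch of exact max-sum on a chain with discrete states. It is the generic textbook dynamic program, not the authors' Slice Chain Max-Sum (which also involves slicing and sampling); the `unary` and `pairwise` score tables are hypothetical toy values. An energy minimization can be cast in this form by using negated energies as scores.

```python
def max_sum_chain(unary, pairwise):
    """Exact max-sum on a chain of discrete variables.

    unary[i][s]       : score of variable i taking state s
    pairwise[i][s][t] : score of (variable i = s, variable i + 1 = t)
    Returns (best total score, list with the best state of each variable).
    """
    best = list(unary[0])  # best[s]: best score of a prefix ending in state s
    back = []              # back[i][t]: best predecessor state of t at variable i + 1
    for i in range(1, len(unary)):
        new_best, pointers = [], []
        for t, u in enumerate(unary[i]):
            cand = [best[s] + pairwise[i - 1][s][t] for s in range(len(best))]
            s_star = max(range(len(cand)), key=cand.__getitem__)
            new_best.append(cand[s_star] + u)
            pointers.append(s_star)
        best = new_best
        back.append(pointers)
    # Backtrack from the best final state to recover the full assignment.
    state = max(range(len(best)), key=best.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return max(best), path

# Toy chain (hypothetical): three binary variables, neighbours rewarded for agreeing.
unary = [[0, 1], [0, 0], [2, 0]]
agree = [[2, 0], [0, 2]]
score, path = max_sum_chain(unary, [agree, agree])
```

The cost is linear in the chain length and quadratic in the number of states per variable, whereas enumerating all joint assignments is exponential in the chain length.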
Citations: 0
Hypothesis Ranking Based on Semantic Event Similarities
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2011-01-01 DOI: 10.2197/IPSJTBIO.4.9
Taiki Miyanishi, Kazuhiro Seki, K. Uehara
Accelerated by technological advances in the biomedical domain, the size of its literature has been growing very rapidly. As a consequence, it is not feasible for individual researchers to comprehend and synthesize all the information related to their interests. It is therefore conceivable to discover hidden knowledge, or hypotheses, by linking fragments of information independently described in the literature. In fact, such hypotheses have been reported in the literature mining community, and some of them have even been corroborated by experiments. This paper mainly focuses on hypothesis ranking and investigates an approach to identifying reasonable hypotheses based on semantic similarities between the events that lead to the respective hypotheses. Our assumption is that hypotheses generated from semantically similar events are more reasonable. We developed a prototype system called Hypothesis Explorer and conducted evaluative experiments, through which the validity of our approach is demonstrated in comparison with approaches based on term frequencies, which were often adopted in previous work.
Citations: 1
A Web Server for Multi-objective Pairwise RNA Sequence Alignment with an Index for Selecting Accurate Alignments
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2011-01-01 DOI: 10.2197/IPSJTBIO.4.2
A. Taneda
The importance of non-coding RNAs and their informatics tools has grown for a decade due to a drastic increase of known non-coding RNAs. RNA sequence alignment is one of the most important technologies in such informatics tools. Recently, we have proposed a multi-objective genetic algorithm, Cofolga2mo, for obtaining an approximate set of weak Pareto optimal solutions for global pairwise RNA sequence alignment, where a sequence similarity and a secondary structure contribution are taken into account as objective functions. In the present study, we have developed a web server for obtaining RNA sequence alignments by Cofolga2mo and for assisting the decision making from the alignments. Furthermore, we introduced an index for reducing the number of alignments output by Cofolga2mo. As a result, we successfully reduced the maximum number of alignments for an input RNA sequence pair from fifty to ten without a significant loss of accurate alignments. By using the BRAliBase 2.1 benchmark dataset, we show that a set of alignments output by Cofolga2mo for an input RNA sequence pair, which has at most ten alignments, includes an accurate alignment compared to those of the previous mono-objective RNA sequence alignment programs.
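The notion of a weak Pareto optimal set over two objectives can be made concrete with a small sketch. It filters a list of candidate score pairs and is not Cofolga2mo itself, which is a genetic algorithm; the `candidates` values are hypothetical (sequence similarity, structure contribution) pairs, and both objectives are assumed to be maximized.

```python
def weak_pareto_front(points):
    """Keep the points that no other point beats strictly in every objective.

    All objectives are assumed to be maximized.
    """
    def strictly_better(q, p):
        return all(b > a for a, b in zip(p, q))

    return [p for p in points if not any(strictly_better(q, p) for q in points)]

# Hypothetical (sequence similarity, structure contribution) scores of candidate alignments.
candidates = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.8), (0.5, 0.5), (0.4, 0.3)]
front = weak_pareto_front(candidates)
```

Here `(0.5, 0.5)` and `(0.4, 0.3)` are dropped because `(0.7, 0.6)` is better on both objectives, while the three remaining pairs each represent a different trade-off. A multi-objective aligner returns such a set rather than a single alignment, which is why an index for narrowing it down is useful.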
Citations: 1
Algebraic Approaches to Underdetermined Experiments in Biology
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2010-12-01 DOI: 10.2197/IPSJTBIO.3.62
H. Yoshida, Kinji Kimura, Naoki Yoshida, Junko Tanaka, Y. Miwa
We sometimes encounter an experiment whose rate constants cannot be determined from that experiment alone; such an experiment is called underdetermined. One method of overcoming underdetermination is to combine the results of multiple experiments. Multiple experiments give rise to a large number of parameters and variables to analyze, and the resulting system usually has a complicated solution set with multiple solutions, a situation that is not known beforehand. These two difficulties, underdetermination and multiple solutions, lead to confusion as to whether rate constants can intrinsically be determined through experiment or not. To analyze such experiments, we use ‘prime ideal decomposition’ to decompose a solution into simpler solutions. It is, however, hard to decompose a set of polynomials with a large number of parameters and variables. Taking a bio-imaging problem as an example, we propose one tip and one technique using the ‘resultant’ from a biological viewpoint.
Citations: 3
Differentially Aberrant Region Detection in Array CGH Data Based on Nearest Neighbor Classification Performance
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2010-10-13 DOI: 10.2197/IPSJTBIO.3.70
Yuta Ishikawa, I. Takeuchi
Array CGH is a useful technology for detecting copy number aberrations on a genome-wide scale. We study the problem of detecting differentially aberrant genomic regions in two or more groups of CGH arrays and estimating the statistical significance of those regions. An important property of array CGH data is that there are spatial correlations among probes, and this fact needs to be taken into consideration when developing a computational algorithm for array CGH data analysis. In this paper we first discuss three difficult issues underlying this problem, and then introduce a nearest-neighbor multivariate test in order to alleviate these difficulties. Our proposed approach has three advantages. First, it can incorporate the spatial correlation among probes. Second, genomic regions of different sizes can be analyzed on common ground. Finally, the computational cost can be considerably reduced with the use of a simple trick. We demonstrate the effectiveness of our approach through an application to a previously published array CGH data set on 75 malignant lymphoma patients.
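One way to read "a test based on nearest-neighbor classification performance" is sketched below: the statistic is the fraction of samples whose nearest other sample belongs to the same group, and its significance is estimated by permuting the group labels. This is an assumed, generic formulation for illustration only; the abstract does not give the authors' exact statistic or their computational trick, and the `samples` and `labels` values are toy data.

```python
import random

def nn_same_label_fraction(samples, labels):
    """Fraction of samples whose nearest other sample (squared Euclidean) shares their label."""
    hits = 0
    for i, x in enumerate(samples):
        j = min(
            (k for k in range(len(samples)) if k != i),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, samples[k])),
        )
        hits += labels[i] == labels[j]
    return hits / len(samples)

def permutation_p_value(samples, labels, n_perm=1000, seed=0):
    """Return (observed statistic, permutation p-value) under random relabeling."""
    rng = random.Random(seed)
    observed = nn_same_label_fraction(samples, labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if nn_same_label_fraction(samples, shuffled) >= observed:
            count += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (count + 1) / (n_perm + 1)

# Toy data (hypothetical): each sample is the probe values of one region in one array.
samples = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (1.0, 1.1), (1.1, 1.0), (0.9, 1.0)]
labels = [0, 0, 0, 1, 1, 1]
observed, p_value = permutation_p_value(samples, labels)
```

Because each sample is treated as a whole vector of probe values, correlated neighbouring probes enter the distance jointly instead of being tested one probe at a time, and regions with different numbers of probes yield statistics on the same 0 to 1 scale.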
Citations: 0
Monte Carlo-based Mouse Nuclear Receptor Superfamily Gene Regulatory Network Prediction: Stochastic Dynamical System on Graph with Zipf Prior
Q3 Biochemistry, Genetics and Molecular Biology Pub Date : 2010-03-15 DOI: 10.2197/IPSJTBIO.3.24
Y. Kitamura, Tomomi Kimiwada, J. Maruyama, T. Kaburagi, Takashi Matsumoto, K. Wada
A Monte Carlo-based algorithm is proposed to predict the gene regulatory network structure of the mouse nuclear receptor superfamily, about which little is known although those genes are believed to be related to several intractable diseases. The gene expression data are regarded as sample vector trajectories from a stochastic dynamical system on a graph. The problem is formulated within a Bayesian framework in which the graph prior distribution is assumed to follow a Zipf distribution. The appropriateness of a graph is evaluated by the graph posterior mean. The algorithm is implemented with the Exchange Monte Carlo method. After validation against synthesized data, an attempt is made to use the algorithm to predict the network structure of the target, the mouse nuclear receptor superfamily. Several remarks are made on the feasibility of the predicted network from a biological viewpoint.
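The Exchange Monte Carlo method mentioned in the abstract is also known as replica exchange or parallel tempering. The sketch below shows only the generic mechanism on a toy one-dimensional double-well energy. The function names, the energy function, the temperature ladder, and the step size are assumptions for illustration, and nothing here reproduces the paper's graph posterior or Zipf prior.

```python
import math
import random

def energy(x):
    # Toy double-well energy with minima at x = -2 and x = +2
    # (a stand-in for a negative log posterior; purely illustrative).
    return (x * x - 4.0) ** 2 / 4.0

def exchange_monte_carlo(betas, n_sweeps=5000, step=0.5, seed=0):
    """Run one replica per inverse temperature and return the coldest replica's samples.

    betas must be ordered from hottest (smallest) to coldest (largest).
    """
    rng = random.Random(seed)
    states = [0.0 for _ in betas]
    cold_samples = []
    for _ in range(n_sweeps):
        # Metropolis update inside each replica at its own inverse temperature.
        for i, beta in enumerate(betas):
            proposal = states[i] + rng.uniform(-step, step)
            delta = energy(proposal) - energy(states[i])
            if delta <= 0 or rng.random() < math.exp(-beta * delta):
                states[i] = proposal
        # Propose state exchanges between neighboring temperatures, accepted
        # with probability min(1, exp((beta_i - beta_j) * (E_i - E_j))).
        for i in range(len(betas) - 1):
            log_ratio = (betas[i] - betas[i + 1]) * (
                energy(states[i]) - energy(states[i + 1])
            )
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                states[i], states[i + 1] = states[i + 1], states[i]
        cold_samples.append(states[-1])
    return cold_samples
```

A call such as `exchange_monte_carlo([0.1, 0.3, 1.0, 3.0])` lets the hot replicas cross the barrier between the two wells and pass those states down to the cold replica through exchanges. This is the usual reason for choosing the method when the target distribution has several well-separated modes.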
Citations: 0