2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): Latest Articles

An accurate, automatic method for markerless alignment of electron tomographic images
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706597
Qi Chu, Fa Zhang, Kai Zhang, Xiaohua Wan, Mingwei Chen, Zhiyong Liu
Accurate alignment of electron tomographic images without using embedded gold particles as fiducial markers is still a challenge. Here we propose a new markerless alignment method that employs Scale Invariant Feature Transform (SIFT) features as virtual markers. SIFT features differ from other feature types in the sufficient and distinctive information they represent. This characteristic makes the subsequent feature matching and tracking steps automatic and more reliable, which allows alignment parameters to be estimated accurately. Furthermore, we use Sparse Bundle Adjustment (SPA) with M-estimation to estimate alignment parameters for each image. Experiments show that our method can achieve a reprojection residual of less than 0.4 pixel and can approach the accuracy of marker-based alignment. In addition, our method can correct typical misalignments such as magnitude divergences or in-plane rotation, and can detect bad images.
Citations: 1
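The abstract above pairs bundle adjustment with M-estimation. As a rough illustration of what M-estimation contributes, the sketch below estimates a single location parameter by iteratively reweighted least squares with Huber weights, so that one gross outlier (standing in for a bad feature match) has limited influence. This is not the authors' alignment code: the names `huber_weight` and `robust_mean`, the threshold `delta`, and the toy residuals are all assumptions made for illustration.

```python
def huber_weight(residual, delta=1.0):
    """Huber weight: 1 for small residuals, delta/|r| for large ones."""
    r = abs(residual)
    return 1.0 if r <= delta else delta / r


def robust_mean(values, delta=1.0, iterations=50, tol=1e-9):
    """Estimate a location parameter by iteratively reweighted least squares."""
    estimate = sum(values) / len(values)  # start from the ordinary mean
    for _ in range(iterations):
        # Down-weight values that lie far from the current estimate.
        weights = [huber_weight(v - estimate, delta) for v in values]
        updated = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(updated - estimate) < tol:
            return updated
        estimate = updated
    return estimate


# Hypothetical residuals; the last value mimics a badly matched feature.
residuals = [0.1, -0.2, 0.05, 0.15, -0.1, 8.0]
plain = sum(residuals) / len(residuals)
robust = robust_mean(residuals, delta=0.5)
```

The ordinary mean is dragged toward the outlier, while the reweighted estimate stays close to the bulk of the values. A full bundle adjustment would apply the same kind of weighting to reprojection residuals across all images.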
Protein-protein interaction prediction using desolvation energies and interface properties
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706528
L. Rueda, Sridip Banerjee, Md. Mominul Aziz, Mohammad Raza
An important aspect of understanding and classifying protein-protein interactions (PPI) is to analyze their interfaces in order to distinguish between transient and obligate complexes. We propose a classification approach to discriminate between these two types of complexes. Our approach has two important aspects. First, we use the desolvation energies (amino acid and atom type) of the residues present in the interface as the input features of the classifiers. Principal components of the data are found, and the classification is then performed via linear dimensionality reduction (LDR) methods. Second, we investigate various interface properties of these interactions. From the analysis of protein quaternary structures, physicochemical properties are treated as the input features of the classifiers. Various features are extracted from each complex, and the classification is performed via different LDR methods. The results on standard benchmarks of transient and obligate protein complexes show that (i) desolvation energies are better discriminants than solvent accessibility and conservation properties, among others, and (ii) the proposed approach outperforms previous solvent-accessible-area-based approaches that use support vector machines.
Citations: 16
IsoKEGG: A logic based system for querying biological pathways in KEGG
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706642
Kazi Zakia Sultana, Anupam Bhattacharjee, H. Jamil
Understanding the interaction patterns among a set of biological entities in a pathway is an important exercise because it could reveal the role of the entities in biological systems. Although a considerable amount of effort has been directed to the detection and mining of patterns in biological pathways in contemporary research, querying biological pathways has remained relatively unexplored. Querying is fundamentally different in that we retrieve pathways that satisfy a given property in terms of their topology or constituents. One such property is subnetwork matching using various constituent parameters. In this paper, we introduce a logic based framework for querying biological pathways based on a novel and generic subgraph isomorphism computation technique. We cast this technique into a graphical interface called IsoKEGG to facilitate flexible querying of KEGG pathways. We demonstrate that IsoKEGG is flexible enough to allow querying based on isomorphic pathway topologies as well as matching any combination of node names, types, and edges. It also allows editing of KGML-represented query pathways and returns all pathways in KEGG that satisfy a given query condition, which users can then investigate further.
Citations: 3
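The IsoKEGG abstract above rests on subgraph matching between a query pathway and stored pathways. The sketch below shows the general idea with a brute-force backtracking search over directed graphs. It is a generic textbook-style matcher, not the subgraph isomorphism technique the paper introduces, and the function name `subgraph_matches` and the node labels are hypothetical placeholders rather than KEGG identifiers.

```python
def subgraph_matches(query_edges, target_edges):
    """Yield mappings query node -> target node that preserve every query edge.

    Both graphs are directed and given as sets of (source, destination) pairs.
    Only edge preservation is checked (a non-induced match).
    """
    query_nodes = sorted({n for edge in query_edges for n in edge})
    target_nodes = sorted({n for edge in target_edges for n in edge})
    target_edges = set(target_edges)

    def consistent(mapping):
        # Every query edge whose endpoints are both mapped must exist in the target.
        return all(
            (mapping[a], mapping[b]) in target_edges
            for a, b in query_edges
            if a in mapping and b in mapping
        )

    def extend(mapping, remaining):
        if not remaining:
            yield dict(mapping)
            return
        node = remaining[0]
        for candidate in target_nodes:
            if candidate in mapping.values():
                continue  # keep the mapping injective
            mapping[node] = candidate
            if consistent(mapping):
                yield from extend(mapping, remaining[1:])
            del mapping[node]

    yield from extend({}, query_nodes)


# Hypothetical three-step query chain searched in a four-step target chain.
query = {("a", "b"), ("b", "c")}
target = {("x", "y"), ("y", "z"), ("z", "w")}
matches = list(subgraph_matches(query, target))
```

The search is exponential in the worst case, which is acceptable for small query pathways. Matching on node names or types, as the abstract mentions, could be added as an extra test inside `consistent`.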
Classification of genome-wide copy number variations and their associated SNP and gene networks analysis
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706526
Yang Liu, Yiu-Fai Lee, M. Ng
Detection of genomic DNA copy number variations (CNVs) can provide a more complete and comprehensive view of human disease. In this paper, we incorporate DNA copy number variation data derived from SNP arrays into a computational shrunken model and formalize the detection of copy number variations as a case-control classification problem. By shrinkage, the number of CNVs relevant to disease can be determined. In order to understand the relevant CNVs, we study their corresponding SNPs in the genome and identify the unique genes in which those SNPs are located. A gene-gene similarity value is computed using GOSemSim, and gene pairs whose similarity value exceeds a threshold are selected to construct several groups of genes. For the SNPs involved in these groups of genes, the statistical software PLINK is employed to compute the pair-wise SNP-SNP interactions and to identify SNP networks based on their p-values. Using two real genome-wide data sets, we further demonstrate that SNP and gene networks play a role in the biological process. An analysis shows that such networks are related, directly or indirectly, to disease study.
Citations: 0
A generalized sequence pattern matching algorithm using complementary dual-seeding
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706593
Bing Ni, Leung-Yau Lo, K. Leung
In this work, we define generalized (sequence) patterns, which are based on several real biological problems, including transcription factors (TFs) binding to transcription factor binding sites (TFBSs), cis-regulatory modules, protein domain analysis, and alternative splicing. Simply speaking, a generalized pattern is composed of several substrings with gaps between adjacent substrings. We propose a generalized pattern matching algorithm that uses a complementary dual-seeding strategy, which is sensitive to errors (both mismatches and indels). We also develop a generalized pattern matching tool, which is to our knowledge the first developed specifically for generalized pattern matching. Rather than replacing existing general-purpose matching tools such as BLAST, BLAT, and PatternHunter, our tool provides an alternative and helps users solve real problems, especially those that can be modeled as generalized patterns. In experiments, we use data randomly sampled from reference sequences of the human genome (NCBI build v18) and hit 98.74% of generalized patterns on average. The tool runs on both Linux and Windows platforms, and its peak memory usage is only slightly above 1 GB.
Citations: 1
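The abstract above defines a generalized pattern as several substrings separated by gaps. A minimal way to make that definition concrete is to compile it into a regular expression, as sketched below. This covers exact matching only: it does not implement the complementary dual-seeding strategy or tolerate mismatches and indels, and the helper names, the bounded gap ranges, and the toy sequence are assumptions introduced for the example.

```python
import re


def compile_generalized_pattern(segments, gaps):
    """Build a regex for literal segments separated by gaps of bounded length.

    segments: list of literal substrings, e.g. ["ACG", "TTA"]
    gaps: list of (min_len, max_len) pairs, one per adjacent segment pair
    """
    if len(gaps) != len(segments) - 1:
        raise ValueError("need exactly one gap range between adjacent segments")
    parts = [re.escape(segments[0])]
    for (lo, hi), segment in zip(gaps, segments[1:]):
        parts.append(".{%d,%d}" % (lo, hi))  # any characters, bounded length
        parts.append(re.escape(segment))
    # Wrapping in a lookahead lets overlapping occurrences be reported too.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_generalized(pattern, text):
    """Return (start, matched_text) for every start position that matches."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(text)]


# Hypothetical pattern: "ACG", then 1 to 3 arbitrary bases, then "TTA".
pattern = compile_generalized_pattern(["ACG", "TTA"], [(1, 3)])
hits = find_generalized(pattern, "GGACGCTTAACGAATTAGG")
```

On this toy sequence the two hits start at positions 2 and 9. A regex scan of this kind is linear-ish per pattern but has no index, so it would not scale to genome-wide search the way a seeding approach is meant to.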
MiRNAs as promising phylogenetic markers for inferring deep metazoan phylogeny and in support of Olfactores hypothesis
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706545
Q. Cai, Xiaoyan Zhang, Zuofeng Li
The long branch attraction (LBA) artefact induced by fast-evolving Urochordata has hindered the interpretation of relationships among the 3 subphyla of Chordata. Although the Olfactores hypothesis, which places Urochordata rather than Cephalochordata as the closest relative of Craniata, has gradually been accepted, every step of phylogenetic reconstruction has to be treated prudently to minimize the LBA phenomenon. MiRNAs (microRNAs) are well known for their 1) adherence to organism development, 2) high conservation, and 3) rarity of secondary loss, parallel evolution, and convergence among metazoans. We therefore suppose miRNAs to be promising candidates for dispelling the LBA phenomenon. We performed a phylogenetic study on 35 pre-miRNA datasets and reconstructed a Chordata phylogeny that supports the Olfactores hypothesis with less effort, using fewer datasets and an unspecified substitution model. This is the first attempt to apply miRNA sequences to interpreting Chordata phylogeny, and we regard miRNAs as promising phylogenetic markers for illuminating deuterostome evolution.
Citations: 2
Prediction of DNA-binding protein based on alpha shape modeling
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706529
Weiqiang Zhou, Hong Yan
Previous studies of protein-DNA interaction focused on the bound structure of DNA-binding proteins and provided good but not practical results. In our work, we apply an alpha shape model to represent the surface structure of the protein-DNA complex and use structural alignment to develop an interface-atom curvature-dependent conditional probability discriminatory function for the prediction of unbound DNA-binding proteins. The proposed method performs well in predicting the unbound structure of DNA-binding proteins, which is potentially useful in many fields. Computer experiment results show that the curvature-dependent formalism with the optimal parameters can achieve sensitivity ranging from 48.08% to 44.23% and specificity ranging from 73.82% to 84.29%.
Citations: 3
Fuzzy C-means method with empirical mode decomposition for clustering microarray data
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706561
Yanfei Wang, Zuguo Yu, V. Anh
Microarray techniques have revolutionized genomic research by making it possible to monitor the expression of thousands of genes in parallel. Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. Existing clustering approaches, mainly developed in computer science, have been adapted to microarray data. Among these approaches, the fuzzy C-means (FCM) method is an efficient one. However, microarray data contains noise, and this noise affects clustering results. Some clustering structure can still be found in random data that has no biological significance. In this paper, we propose to combine the FCM method with empirical mode decomposition (EMD) for clustering microarray data in order to reduce the effect of the noise. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). Using the FCM-EMD method on gene microarray data, we obtained better results than with FCM alone. The results suggest that the clustering structures of denoised data are more reasonable and that genes have a tighter association with their clusters. Denoised gene data without any biological information contains no cluster structure. We find that, by analyzing denoised microarray data, we can to some degree avoid estimating the fuzzy parameter m. This makes clustering more efficient. Using the FCM-EMD method to analyze gene microarray data can save time and yield more reasonable results.
Citations: 15
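The FCM-EMD abstract above builds on the fuzzy C-means method and its fuzzy parameter m. For orientation, here is a minimal sketch of plain fuzzy C-means in pure Python, alternating the standard membership and center updates. The EMD denoising step is not shown, and the function name `fuzzy_c_means`, the caller-supplied initial centers, and the toy profiles are assumptions for illustration, not the authors' implementation.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=100, tol=1e-6):
    """Plain fuzzy C-means on a list of equal-length numeric tuples.

    centers: initial cluster centers (chosen by the caller in this sketch)
    m: fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships), where memberships[j][i] is the degree
    to which points[j] belongs to cluster i in the final iteration.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [tuple(c) for c in centers]
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: inverse squared distance raised to 1/(m-1), normalized.
        memberships = []
        for p in points:
            d2 = [sq_dist(p, c) for c in centers]
            if 0.0 in d2:
                # Point sits exactly on a center: split full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                row = [(1.0 / d) ** exponent for d in d2]
            total = sum(row)
            memberships.append([v / total for v in row])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            norm = sum(weights)
            new_centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / norm
                for k in range(len(points[0]))
            ))
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Hypothetical two-dimensional expression profiles forming two groups.
profiles = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(profiles, centers=[(0.0, 0.0), (1.0, 1.0)])
```

Each membership row sums to 1, so a gene can belong partly to several clusters. The choice of m controls how sharply memberships separate, which is the parameter the abstract says denoising helps to sidestep.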
A procedure for identifying master regulators in conjunction with network screening and inference
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706580
Shigeru Saito, Xinrong Zhou, Taejeong Bae, Sunghoon Kim, K. Horimoto
We developed a procedure for identifying transcriptional master regulators (MRs) related to special biological phenomena, such as diseases, in conjunction with network screening and inference. Network screening is a system for detecting activated transcriptional regulatory networks under particular conditions, based on an estimate of how consistent the graph structure is with the measured data. Since network screening uses known transcription factor (TF)-gene relationships as the experimental evidence for the molecular relationships, its performance depends on the ensemble of known TF networks used for the analysis. To compensate for this restriction, a network inference method, the path consistency algorithm, is used alongside it to identify MRs. The performance is illustrated by means of the known MRs in brain tumors that were computationally inferred and experimentally verified. The present procedure worked well for identifying MRs in comparison with the previous computational selection for experimental verification.
Citations: 4
Gene expression rule discovery with a multi-objective neural-genetic hybrid
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706646
E. Keedwell, A. Narayanan
Recent advances in microarray technology allow an unprecedented view of the biochemical mechanisms contained within a cell, but deriving useful information from the data is still proving to be a difficult task. This paper describes a novel method based on a multi-objective genetic algorithm that discovers relevant sets of genes and uses a neural network to create rules from the evolved genes. This hybrid method is shown to work on four well-established gene expression datasets taken from the literature. The results indicate that the approach can return results that are both biologically intelligible and plausible. The proposed method requires no pre-filtering or preselection of genes.
Citations: 0