2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): Latest Publications

Improving robustness of gene ranking by resampling and permutation based score correction and normalization
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706607
Feng Yang, K. Mao
Feature ranking, which ranks features by their individual importance, is one of the most frequently used feature selection techniques. Traditional feature ranking criteria are apt to produce inconsistent rankings, even under slight perturbations of the training samples, when applied to high-dimensional, small-sample gene expression data. A widely used strategy for resolving these inconsistencies is multi-criterion combination. However, one problem encountered in combining multiple criteria is score normalization. In this paper, problems in existing methods are first analyzed, and a new gene importance transformation algorithm is then proposed. Experimental studies on three popular gene expression datasets show that multi-criterion combination based on the proposed score correction and normalization produces gene rankings with improved robustness.
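The abstract does not spell out the proposed transformation, so the following is only a minimal sketch of the general idea of permutation-based score normalization before combining criteria. The two criteria (absolute Welch t-statistic and absolute Pearson correlation), the z-score against a label-permutation null, and the plain averaging step are all assumptions chosen for illustration, not the authors' algorithm; every function name is hypothetical.

```python
import numpy as np

def abs_t_score(X, y):
    """Absolute Welch t-statistic per gene (column) for binary labels y."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def abs_corr_score(X, y):
    """Absolute Pearson correlation between each gene and the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / denom

def permutation_normalize(score_fn, X, y, n_perm=200, seed=None):
    """Express each gene's score as a z-score against a label-permutation null."""
    rng = np.random.default_rng(seed)
    observed = score_fn(X, y)
    null = np.stack([score_fn(X, rng.permutation(y)) for _ in range(n_perm)])
    return (observed - null.mean(axis=0)) / (null.std(axis=0) + 1e-12)

def combined_ranking(X, y, criteria, n_perm=200, seed=0):
    """Average the normalized scores (an assumed rule) and rank genes, best first."""
    z = [permutation_normalize(f, X, y, n_perm, seed + i)
         for i, f in enumerate(criteria)]
    return np.argsort(-np.mean(z, axis=0))
```

Calling `combined_ranking(X, y, [abs_t_score, abs_corr_score])` with an `(n_samples, n_genes)` array `X` and a 0/1 label array `y` returns gene indices ordered from most to least important. Normalizing against a shared null puts criteria with different scales on a comparable footing before they are averaged.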
Citations: 6
Improved mammographic mass retrieval performance using multi-view information
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706601
Wei Liu, Weidong Xu, Lihua Li, Shuang Li, Huanping Zhao, Juan Zhang
Breast cancer is the most common malignant disease in women. A mammographic mass retrieval system can help radiologists improve diagnostic accuracy by retrieving biopsy-proven masses that are similar to the mass under diagnosis. However, although screening mammography usually consists of two views (MLO and CC) of the same breast, most breast CAD systems that incorporate image retrieval techniques are based on a single-view principle, in which the query ROI within a view is analyzed independently. In this paper, a mammographic mass retrieval approach based on multi-view information is proposed. In this work, the query example is a multi-view (MLO and CC) mass pair instead of the single-view mass used in the traditional image retrieval framework. In the experiments, several visual features are used for retrieval evaluation. Both distance-based similarity measures, such as Euclidean distance, and non-distance similarity measures based on a k-NN regression model are used for comparison. The experimental study was carried out on a database of 126 biopsy-proven masses (63 mass pairs). Preliminary results showed that the multi-view retrieval approach achieves better retrieval accuracy than the single-view one, especially for the similarity metric based on the k-NN regression model.
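As a rough illustration of querying with a mass pair rather than a single view, the sketch below ranks database pairs by a fused Euclidean distance. The fusion rule (an equal-weight average of the MLO and CC distances) and the function name are assumptions; the abstract does not state how the two views are combined, and the k-NN regression similarity is not shown.

```python
import numpy as np

def retrieve_pairs(query_mlo, query_cc, db_mlo, db_cc, k=3):
    """Return indices of the k database mass pairs closest to the query pair.

    db_mlo and db_cc are (n_pairs, n_features) arrays holding the feature
    vectors of the MLO and CC views of each stored mass pair.
    """
    d_mlo = np.linalg.norm(db_mlo - query_mlo, axis=1)  # per-pair MLO distance
    d_cc = np.linalg.norm(db_cc - query_cc, axis=1)     # per-pair CC distance
    fused = 0.5 * (d_mlo + d_cc)                        # assumed equal weighting
    return np.argsort(fused)[:k]
```

A single-view baseline corresponds to ranking by `d_mlo` or `d_cc` alone, which makes it easy to compare the two settings on the same features.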
Citations: 3
CEO a cloud epistasis computing model in GWAS
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706542
Zhengkui Wang, Yue Wang, K. Tan, L. Wong, D. Agrawal
The 1000 Genomes Project has made available a large number of single nucleotide polymorphisms (SNPs) for genome-wide association studies (GWAS). However, the large number of SNPs has also made the discovery of epistatic interactions among SNPs computationally expensive. Parallelizing the computation offers a promising solution. In this paper, we propose a cloud-based epistasis computing (CEO) model that examines all k-locus SNP combinations to find statistically significant epistatic interactions efficiently. Our CEO model uses the MapReduce framework, which can be executed either on a user's own cluster or in a cloud environment. Our cloud-based solution offers elastic computing resources to users and, more importantly, makes our approach affordable and available to all end users. We evaluate our CEO model on a cluster of more than 40 nodes. Our experimental results show that the CEO model is computationally flexible, scalable and practical.
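The exhaustive k-locus search can be pictured as a map step that scores one SNP combination at a time and a reduce step that keeps the strongest results. The single-process sketch below only mimics that shape with Python generators; it does not use a real MapReduce runtime, and the Pearson chi-square statistic is an assumed stand-in because the abstract does not name the test used by CEO. All function names are hypothetical.

```python
from itertools import combinations
import numpy as np

def chi2_stat(genotypes, phenotype):
    """Pearson chi-square of the joint k-locus genotype against a 0/1 phenotype."""
    # Encode each sample's genotypes (values 0, 1, 2) as one base-3 integer.
    codes = np.zeros(len(phenotype), dtype=int)
    for col in genotypes.T:
        codes = codes * 3 + col
    labels, inv = np.unique(codes, return_inverse=True)
    table = np.zeros((len(labels), 2))
    np.add.at(table, (inv, phenotype), 1)  # contingency table of counts
    expected = (table.sum(axis=1, keepdims=True)
                * table.sum(axis=0, keepdims=True) / table.sum())
    mask = expected > 0
    return float((((table - expected) ** 2)[mask] / expected[mask]).sum())

def map_phase(snps, phenotype, k):
    """Emit (combination, statistic) for every k-locus SNP combination."""
    for combo in combinations(range(snps.shape[1]), k):
        yield combo, chi2_stat(snps[:, list(combo)], phenotype)

def reduce_phase(pairs, top=5):
    """Keep the combinations with the largest statistics."""
    return sorted(pairs, key=lambda kv: kv[1], reverse=True)[:top]
```

Because each combination is scored independently, the work in `map_phase` can be split across workers by partitioning the combinations, which is the property that makes the problem a natural fit for MapReduce.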
Citations: 11
An accurate classification of native and non-native protein-protein interactions using supervised and semi-supervised learning approaches
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706560
Nan Zhao, Bin Pang, C. Shyu, Dmitry Korkin
Progress in experimental and computational structural biology has led to a rapid growth of experimentally resolved structures and computational models of protein-protein interactions. However, distinguishing between physiological and non-physiological interactions remains a challenging problem. In this work, two related problems of interface classification are addressed. The first problem is concerned with the classification of physiological and crystal-packing interactions. The second problem deals with the classification of physiological interactions, or their accurate models, and decoys obtained from inaccurate docking models. We have defined a universal set of interface features and employed supervised and semi-supervised learning approaches to accurately classify the interactions in both problems. Furthermore, we formulated the second problem as a semi-supervised learning problem and employed a transductive SVM to improve the accuracy of classification. Finally, we showed that using the scoring functions from the obtained classifiers, one can improve the accuracy of the docking methods.
Citations: 0
Structural parsimony: Reductions in sequence space
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706536
Roberto Blanco
Computational phylogenetics has historically neglected strict theoretical approaches that exploit the mathematical models beneath which it abstracts away the nuances of evolution. In particular, parsimony is conceptually simple and amenable to rigorous treatment, and has a clear analogue in graph theory, the Steiner tree. We present and refine the notion of sequence space as the soil from which all graph-theoretical methods arise, studying its structural properties and complexity with an eye on maximum parsimony. We therefrom introduce a basic set of very efficient implicit reductions that discard information with a fixed effect on the optimality of the solution, and show how it can be applied to large, real datasets.
Citations: 2
Robust hidden semi-Markov modeling of array CGH data
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706637
Jiarui Ding, Sohrab P. Shah
As an extension of hidden Markov models, hidden semi-Markov models allow the probability distribution of staying in the same state to be a general distribution. Therefore, with appropriately chosen state duration distributions, hidden semi-Markov models are good at modeling sequences made up of a succession of homogeneous zones. Hidden semi-Markov models are generative models and are usually trained by maximum likelihood estimation. To compensate for model misspecification and provide protection against outliers, hidden semi-Markov models can be trained discriminatively given a labeled training set, at the expense of increased training complexity. As an alternative to discriminative training, in this paper we handle model misspecification and outliers by adopting robust methods. Specifically, we use Student's t mixture models as the emission distributions of hidden semi-Markov models. The proposed robust hidden semi-Markov models are used to model array-based comparative genomic hybridization data. Experiments conducted on benchmark data from the Coriell cell lines and on glioblastoma multiforme data illustrate the reliability of the technique.
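The robustness argument rests on the heavier tails of the Student's t density: an outlying measurement is penalized far less than under a Gaussian emission, so it is less likely to force a spurious state change. The sketch below compares the two log-densities for a single component only; the paper uses mixtures of t distributions, and the function names and parameter values here are illustrative assumptions.

```python
import math

def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a location-scale Student's t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def gaussian_logpdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z

# An observation six scale units from the state mean is penalized much less
# heavily by the t emission (nu = 3 chosen arbitrarily) than by the Gaussian.
outlier = 6.0
t_penalty = student_t_logpdf(outlier, 0.0, 1.0, 3.0)
gauss_penalty = gaussian_logpdf(outlier, 0.0, 1.0)
```

As `nu` grows, the t density approaches the Gaussian, so the degrees of freedom act as a knob for how much outlier tolerance the emission model has.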
Citations: 3
Truncation of protein sequences for fast profile alignment with application to subcellular localization
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706548
M. Mak, Wei Wang, S. Kung
We have recently found that the computation time of homology-based subcellular localization can be substantially reduced by aligning profiles up to the cleavage site positions of signal peptides, mitochondrial targeting peptides, and chloroplast transit peptides [1]. While the method can reduce the profile alignment time by as much as 20-fold, it cannot reduce the computation time spent on creating the profiles. In this paper, we propose a new approach that can reduce both the profile creation time and the profile alignment time. In the new approach, instead of cutting the profiles, we shorten the sequences by cutting them at the cleavage site locations. The shortened sequences are then presented to PSI-BLAST to compute the profiles. Experimental results and analysis of profile-alignment score matrices suggest that both profile creation time and profile alignment time can be reduced without sacrificing subcellular localization accuracy. Once a pairwise profile-alignment score matrix has been obtained, a one-vs-rest SVM classifier can be trained. To further reduce the training and recognition time of the classifier, we propose a perturbation discriminant analysis (PDA) technique. PDA was found to have a shorter training time than the conventional SVM.
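The sequence-shortening step itself is simple and can be sketched as follows. The cleavage positions are assumed to come from an external predictor that is not shown, and the behavior when no site is predicted (keep the full sequence) is an assumption, since the abstract does not say how such sequences are handled. The function names are hypothetical.

```python
def truncate_at_cleavage(sequence, cleavage_site):
    """Keep only the residues up to the predicted cleavage site.

    cleavage_site is the number of N-terminal residues to keep; None or a
    non-positive value means no site was predicted (assumed: keep everything).
    """
    if cleavage_site is None or cleavage_site <= 0:
        return sequence
    return sequence[:cleavage_site]

def truncate_all(sequences, cleavage_sites):
    """Truncate a dict of {id: sequence} using a dict of {id: cleavage_site}."""
    return {seq_id: truncate_at_cleavage(seq, cleavage_sites.get(seq_id))
            for seq_id, seq in sequences.items()}
```

The shortened sequences, rather than the originals, would then be passed to the profile-building step, which is where the saving in profile creation time comes from.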
Citations: 0
Sequence and structural features of binding site residues in protein-protein complexes
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706535
M. Gromiha, N. Saranya, S. Selvaraj, B. Jayaram, K. Fukui
We have developed an energy-based approach for identifying the binding site residues in protein-protein complexes. The binding site residues have been analyzed with sequence- and structure-based parameters, such as neighboring residues in the vicinity of binding sites and conformational switching. We observed specific preferences of dipeptides and tripeptides for binding, which are unique to protein-protein complexes. Our analysis showed that 7% of residues changed their conformations upon protein-protein complex formation; the figures are 9.2% and 6.6% in the binding and non-binding sites, respectively. Specifically, the residues Glu, Lys, Leu and Ser changed their conformation from coil to helix/strand and from helix to coil/strand. Leu, Ser, Thr and Val prefer to change their conformation from strand to coil/helix. The results obtained in this study will be helpful for understanding and predicting the binding sites in protein-protein complexes.
Citations: 0
Toward automatically drawn metabolic pathway atlas with peripheral node abstraction algorithm
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706644
Myungha Jang, A. Rhie, Hyun Seok Park
Graphical layout techniques play a vital part in systems biology by enhancing the understanding and visualization of chemical reaction pathways in the body. Metabolic networks have particularly complex binding structures, which makes their graphical representation challenging to comprehend. For the sake of legibility, reducing graph complexity in metabolic networks is crucial when working with a large number of nodes and edges. This paper introduces a node abstraction algorithm that treats metabolic pathways as hierarchical networks and considers reactions between compound pairs (the equivalent of node pairs in the context of biological networks) as an elastic parameter for automated reaction compression. Substrates and products that locally compose reactions with low connectivity were reduced, and cyclical or hierarchical pathways were aligned according to their structural composition.
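One simple way to picture the reduction of low-connectivity substrates and products is to drop nodes whose degree falls at or below a threshold. The sketch below does only that; the degree-based rule, the threshold, and the function name are assumptions for illustration and are much cruder than the hierarchical abstraction the paper describes.

```python
def abstract_peripheral_nodes(edges, max_degree=1):
    """Remove nodes with degree <= max_degree and the edges that touch them.

    edges is a list of (compound_a, compound_b) pairs. Returns the kept edges
    and the set of nodes that were abstracted away.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    peripheral = {node for node, d in degree.items() if d <= max_degree}
    kept = [(u, v) for u, v in edges
            if u not in peripheral and v not in peripheral]
    return kept, peripheral
```

Returning the removed nodes alongside the kept edges lets a layout step redraw them later as collapsed annotations instead of discarding them.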
Citations: 0
Detection and application of CagA sequence markers for assessing risk factor of gastric cancer caused by Helicobacter pylori
Pub Date : 2010-12-01 DOI: 10.1109/BIBM.2010.5706614
Chao Zhang, Shunfu Xu, Dong Xu
As a marker of Helicobacter pylori, cytotoxin-associated gene A (CagA) has been shown to be the major virulence factor causing gastroduodenal diseases. However, the molecular mechanisms that underlie the development of different gastroduodenal diseases caused by cagA-positive H. pylori infection remain unknown. Current studies are mainly limited to the relationship between EPIYA motifs in the CagA strain and diseases, but such a relationship is insufficient to explain the diversity of diseases. We propose a new and systematic method to analyze the relationship between whole CagA sequence patterns and diseases. For this purpose, we introduced an entropy calculation to detect key residues of CagA as gastric cancer biomarkers, and then employed a supervised learning procedure to classify cancer-related and non-cancer-related CagA strains using the key residues. We achieved 76% and 71% classification accuracy for Western and East Asian subtypes, respectively. Our study may help establish H. pylori biomarkers for predicting gastroduodenal disease outcome.
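The abstract does not give the exact entropy measure, so the sketch below uses one plausible reading: for each alignment position, compare the Shannon entropy of the pooled sequences with the size-weighted entropy inside the cancer and non-cancer groups. Positions where splitting by disease label removes the most uncertainty (the information gain) are candidate key residues. The scoring rule and all names are assumptions, and the sequences are assumed to be pre-aligned and of equal length.

```python
import math
from collections import Counter

def column_entropy(residues):
    """Shannon entropy (bits) of the residues observed at one position."""
    counts = Counter(residues)
    n = len(residues)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def information_gain(cancer_seqs, control_seqs):
    """Per-position drop in entropy when sequences are split by disease label."""
    n_c, n_n = len(cancer_seqs), len(control_seqs)
    gains = []
    for i in range(len(cancer_seqs[0])):
        col_c = [s[i] for s in cancer_seqs]
        col_n = [s[i] for s in control_seqs]
        pooled = column_entropy(col_c + col_n)
        within = (n_c * column_entropy(col_c)
                  + n_n * column_entropy(col_n)) / (n_c + n_n)
        gains.append(pooled - within)
    return gains
```

The positions with the highest gain could then serve as the input features of the supervised classifier mentioned in the abstract.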
Citations: 1