Andrey Palyanov, Sergey Khayrulin, Stephen D Larson, Alexander Dibert
The nematode C. elegans is the only animal with a known neuronal wiring diagram, or "connectome". During the last three decades, extensive studies of C. elegans have provided wide-ranging data about it, but few systematic ways of integrating these data into a dynamic model have been put forward. Here we present a detailed demonstration of a virtual C. elegans aimed at integrating these data in the form of a 3D dynamic model operating in a simulated physical environment. Our current demonstration includes a realistic flexible worm body model, a muscular system and a partially implemented ventral nerve cord. Our virtual C. elegans demonstrates successful forward and backward locomotion when sinusoidal patterns of neuronal activity are sent to groups of motor neurons. To account for the relatively slow propagation velocity and the attenuation of neuronal signals, we introduced "pseudo neurons" into our model to simulate simplified neuronal dynamics. The pseudo neurons also provide a good way of visualizing the nervous system's structure and activity dynamics.
{"title":"Towards a virtual C. elegans: a framework for simulation and visualization of the neuromuscular system in a 3D physical environment.","authors":"Andrey Palyanov, Sergey Khayrulin, Stephen D Larson, Alexander Dibert","doi":"10.3233/ISB-2012-0445","DOIUrl":"https://doi.org/10.3233/ISB-2012-0445","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"11 3-4","pages":"137-47"},"PeriodicalIF":0.0,"publicationDate":"2011-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.3233/ISB-2012-0445","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30870652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
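The sinusoidal drive mentioned in the C. elegans abstract above can be illustrated with a travelling wave of activation along the body. This is only a minimal sketch under stated assumptions: the function name `muscle_drive`, the segment count, the frequency and the wavelength are all hypothetical and are not taken from the paper.

```python
import math

def muscle_drive(segment, t, frequency_hz=0.5, wavelength_segments=24.0, forward=True):
    """Return (dorsal, ventral) activation in [0, 1] for one body segment.

    segment 0 is the head; every parameter value here is an illustrative assumption.
    """
    # sin(2*pi*(f*t - s/lambda)) is a wave travelling from head to tail;
    # flipping the sign of the spatial term reverses the direction of travel.
    direction = 1.0 if forward else -1.0
    phase = 2.0 * math.pi * (frequency_hz * t - direction * segment / wavelength_segments)
    signal = math.sin(phase)
    # Dorsal and ventral sides are driven in antiphase, so only one side is active at a time.
    dorsal = max(0.0, signal)
    ventral = max(0.0, -signal)
    return dorsal, ventral

if __name__ == "__main__":
    for s in range(0, 24, 6):
        d, v = muscle_drive(s, t=0.5)
        print(f"segment {s:2d}: dorsal={d:.2f} ventral={v:.2f}")
```

Switching `forward` to `False` reverses the direction in which the wave travels along the body, which is one simple way to picture how the same pattern generator could yield backward movement.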
Protein structures with knotted configurations in their native fold have a great impact on protein function. Although knots are identified at the structure level, localizing them has become possible in single-molecule experiments. Signal processing methods, which have played an important role in analysing genomic and proteomic sequences, are also useful for analysing knotted proteins. The amino acid hydrophobicity index contributes to knowledge of protein stability. Water capture and release is found to be controllable by the tightening force in knots, which relates to this index. Fourier analysis, power spectral density estimation and cross-correlation show that knotted proteins are hydrophobic in nature. The set of knotted proteins from the proteinKNOT web server (pKNOT) was used for the experiments, which showed that 93% of them are hydrophobic in their knotted core.
{"title":"Hydrophobic tint of knot proteins","authors":"P. Anto, S. N. Achuthsankar","doi":"10.1145/1722024.1722034","DOIUrl":"https://doi.org/10.1145/1722024.1722034","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"8"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722034","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64107794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
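To make the signal-processing idea in the knot-protein abstract above concrete, the sketch below maps a sequence to a numeric hydrophobicity signal and estimates its power spectrum with a plain discrete Fourier transform. The abstract does not say which hydrophobicity scale was used, so the Kyte-Doolittle scale is an assumption here, and the function names are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_signal(sequence):
    """Map each residue of a one-letter protein sequence to its hydropathy value."""
    return [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]

def mean_hydropathy(sequence):
    """Average hydropathy; positive values indicate a hydrophobic stretch on this scale."""
    signal = hydropathy_signal(sequence)
    return sum(signal) / len(signal)

def power_spectrum(signal):
    """Periodogram of the mean-removed signal for frequencies k = 0 .. n // 2."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(n // 2 + 1):
        # Direct O(n^2) DFT coefficient; fine for short sequences.
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(centered))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

if __name__ == "__main__":
    seq = "ILILILIL"  # toy sequence, not a real knotted protein
    print(mean_hydropathy(seq))
    print(power_spectrum(hydropathy_signal(seq)))
```

A positive `mean_hydropathy` over the residues of a knotted core would be one simple reading of "hydrophobic in nature"; the paper's actual criterion may differ.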
Protein structure prediction is important because it extends our understanding of protein structures and functions. This knowledge is essential for predicting the secondary structures of unknown proteins, as required for applications related to drug discovery. A novel technique for protein secondary structure prediction is presented here. In this work, two levels of multi-layer feed-forward neural networks are used. In the first-level network, sequence profiles from PSI-BLAST and physicochemical properties of amino acids are used for sequence-to-structure predictions. Confidence values for forming helix, sheet and coil, obtained from the first-level network, are then used with the second-level network for structure-to-structure predictions. The overall prediction accuracy obtained through experimentation is in the range of 75.58% to 77.48%. The method is trained and tested on nrDSSP datasets using four-fold cross-validation. It is also tested on target proteins of the Critical Assessment of Protein Structure Prediction experiment (CASP3) and achieves better results than PSIPRED on some target proteins.
{"title":"Improving prediction of protein secondary structure using physicochemical properties of amino acids","authors":"P. Chatterjee, Subhadip Basu, M. Nasipuri","doi":"10.1145/1722024.1722036","DOIUrl":"https://doi.org/10.1145/1722024.1722036","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"10"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722036","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64107844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
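The structure-to-structure stage in the secondary structure abstract above takes first-level confidence values as its input. One common way to present them to a second network is a sliding window over neighbouring residues; the sketch below shows only that input encoding. The window width, the zero padding and the function name are assumptions, since the abstract does not describe these details.

```python
def window_features(confidences, half_width=7):
    """Build second-level inputs from per-residue (helix, sheet, coil) confidences.

    Each output row concatenates the confidences of the residues in a window of
    2 * half_width + 1 positions centred on one residue. Positions that fall
    outside the sequence are zero-padded (an assumed convention).
    """
    padding = (0.0, 0.0, 0.0)
    n = len(confidences)
    features = []
    for center in range(n):
        row = []
        for pos in range(center - half_width, center + half_width + 1):
            row.extend(confidences[pos] if 0 <= pos < n else padding)
        features.append(row)
    return features

if __name__ == "__main__":
    # Toy first-level outputs for a three-residue peptide.
    first_level = [(0.8, 0.1, 0.1), (0.2, 0.7, 0.1), (0.1, 0.1, 0.8)]
    for row in window_features(first_level, half_width=1):
        print(row)
```

Each row would then be fed to the second-level network, which lets the final call for a residue depend on the predicted states of its neighbours.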
A bicluster in a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. In this work, biclusters are identified in two steps. In the first step, high-quality bicluster seeds are generated using the KMeans clustering algorithm. These seeds are then enlarged using the Reactive Greedy Randomized Adaptive Search Procedure (RGRASP), a multi-start metaheuristic method with two phases, construction and local search. The objective is to identify biclusters of maximum size with a mean squared residue (MSR) lower than a given threshold. Experiments are conducted on both the Yeast and Human Lymphoma datasets. The experimental results on these benchmark datasets demonstrate that RGRASP is capable of identifying high-quality biclusters compared to many existing biclustering algorithms. Compared to the existing algorithm based on the same RGRASP metaheuristic, this algorithm obtains biclusters with larger size and lower mean squared residue on the Yeast dataset. Moreover, in this study RGRASP is applied for the first time to find biclusters in the Human Lymphoma dataset.
{"title":"Application of reactive GRASP to the biclustering of gene expression data","authors":"Shyama Das, S. M. Idicula","doi":"10.1145/1722024.1722041","DOIUrl":"https://doi.org/10.1145/1722024.1722041","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"14"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722041","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64107881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
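The MSR threshold in the RGRASP abstract above refers to the mean squared residue score widely used in biclustering. A minimal sketch of that score follows, assuming the usual definition: the residue of an entry is its value minus its row mean minus its column mean plus the overall mean of the bicluster. The function names and the toy matrix are hypothetical, and the greedy acceptance test is only an illustration, not the paper's RGRASP procedure.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column index lists."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)

def can_add_row(matrix, rows, cols, candidate, threshold):
    """Illustrative acceptance test: does adding one gene keep the MSR under the threshold?"""
    return mean_squared_residue(matrix, rows + [candidate], cols) <= threshold

if __name__ == "__main__":
    expression = [
        [1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],   # same pattern as row 0, shifted by +1
        [4.0, 5.0, 6.0],   # same pattern, shifted by +3
        [9.0, 1.0, 5.0],   # unrelated pattern
    ]
    print(mean_squared_residue(expression, [0, 1, 2], [0, 1, 2]))
    print(can_add_row(expression, [0, 1, 2], [0, 1, 2], 3, threshold=0.5))
```

A perfectly additive bicluster, where rows differ only by a constant shift, has a score of zero, which is why a low MSR is read as coherent expression.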
Abhijit J. Kulkarni, A. Noronha, Sasanka Roy, S. Angadi
Text mining is an important research area in applied statistics. The present article addresses an important problem from the bioinformatics field, viz. classification of protein sequences as soluble proteins or inclusion-body-forming proteins when over-expressed in Escherichia coli (E. coli), using text mining and machine learning techniques. We propose a text-mining-based algorithm to extract patterns from the protein sequences that are later used in a support vector classification algorithm. We report the best classification results for this dataset compared to the existing state of the art. Our algorithm is quite general and can be applied to any biological text data. The extracted patterns may give further insight into the underlying dynamics of the sequences that decide the corresponding class membership.
{"title":"Fuzzy pattern extraction for classification of protein sequences","authors":"Abhijit J. Kulkarni, A. Noronha, Sasanka Roy, S. Angadi","doi":"10.1145/1722024.1722046","DOIUrl":"https://doi.org/10.1145/1722024.1722046","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"18"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722046","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64107923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amit Nagal, Swapnil R. Jaiswal, H. Yadav, M. Mohan, P. Ghosh
Many drugs, such as NSAIDs, are used in the treatment of inflammatory disease, but their limitations encourage further research into inflammation-related diseases. Phospholipases A2 (PLA2s) are enzymes that catalyze the hydrolysis of the sn-2 acyl ester linkage of phospholipids, producing fatty acids and lysophospholipids. Their enzymatic activity is a rate-limiting step in the formation of arachidonic acid and subsequently in the synthesis of leukotrienes and prostaglandins. The current structure-based drug design analysis and comparative docking studies of various hnps-PLA2 indole inhibitor derivatives show that they perform better than other molecules. ADME studies show that indole derivatives have the potential to be safe drugs.
{"title":"Study of indole inhibitors to increase the affinity of hnps-PLA2 in inflammatory disease","authors":"Amit Nagal, Swapnil R. Jaiswal, H. Yadav, M. Mohan, P. Ghosh","doi":"10.1145/1722024.1722056","DOIUrl":"https://doi.org/10.1145/1722024.1722056","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"15 1","pages":"27"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722056","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64108413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jyotshna Dongardive, Aarti Patil, A. Bir, S. Jamkhedkar, Siby Abraham
The paper proposes a novel methodology for finding motifs in biological data. It uses a music-inspired metaheuristic optimization technique called harmony search to find motifs. The model uses randomly generated l-mers as the initial harmony memory. Pitch adjustment and random selection are used to generate new l-mers, which are evaluated by a specially defined objective function. The proposed method is experimentally validated using sequences of Human Papillomavirus strains obtained from accredited and authorized sources.
{"title":"Finding motifs using harmony search","authors":"Jyotshna Dongardive, Aarti Patil, A. Bir, S. Jamkhedkar, Siby Abraham","doi":"10.1145/1722024.1722072","DOIUrl":"https://doi.org/10.1145/1722024.1722072","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"41"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722072","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64108497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
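The harmony search loop outlined in the motif-finding abstract above (random l-mers as harmony memory, then memory consideration, pitch adjustment and random selection) can be sketched as follows. The abstract does not give the objective function, so the one used here, the total minimum Hamming distance from a candidate l-mer to each sequence, is an assumption. The parameter values (`hmcr`, `par`, memory size, iteration count) and the toy sequences are also hypothetical.

```python
import random

ALPHABET = "ACGT"

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def total_distance(lmer, sequences):
    """Assumed objective: sum over sequences of the best-matching window's distance."""
    l = len(lmer)
    return sum(min(hamming(lmer, s[i:i + l]) for i in range(len(s) - l + 1))
               for s in sequences)

def harmony_search(sequences, l, memory_size=10, hmcr=0.9, par=0.3,
                   iterations=2000, seed=0):
    """Return (best l-mer, score); lower scores are better."""
    rng = random.Random(seed)
    # Initial harmony memory: randomly generated l-mers.
    memory = ["".join(rng.choice(ALPHABET) for _ in range(l)) for _ in range(memory_size)]
    scores = [total_distance(m, sequences) for m in memory]
    for _ in range(iterations):
        new = []
        for pos in range(l):
            if rng.random() < hmcr:
                base = rng.choice(memory)[pos]          # memory consideration
                if rng.random() < par:
                    # Pitch adjustment, taken here as a switch to a different nucleotide.
                    base = rng.choice([b for b in ALPHABET if b != base])
            else:
                base = rng.choice(ALPHABET)             # random selection
            new.append(base)
        candidate = "".join(new)
        score = total_distance(candidate, sequences)
        worst = max(range(memory_size), key=scores.__getitem__)
        if score < scores[worst]:                       # replace the worst harmony
            memory[worst], scores[worst] = candidate, score
    best = min(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]

if __name__ == "__main__":
    toy = ["TTACGTAA", "GGACGTCC", "ACGTTTTT"]  # toy data, not HPV sequences
    print(harmony_search(toy, l=4))
```

Because the search is stochastic, the result depends on the seed; fixing `seed` makes a run reproducible.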
Biclustering is a very useful data mining technique that identifies coherent patterns in microarray gene expression data. A bicluster of a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering is a powerful analytical tool for biologists and has generated considerable interest over the past few decades. The problem of locating the most significant biclusters in gene expression data has been shown to be NP-complete. In this paper, a PSO-based algorithm is developed for biclustering gene expression data. The algorithm has three steps. In the first step, high-quality bicluster seeds are generated using the KMeans clustering algorithm. In the second step, biclusters are generated from these seeds using particle swarm optimization (PSO). In the third step, an iterative search checks whether more genes and conditions can be added within the given threshold value of the mean squared residue score. Experimental results on real datasets show that our approach can effectively find high-quality biclusters.
{"title":"Biclustering gene expression data using KMeans-binary PSO hybrid","authors":"Shyama Das, S. M. Idicula","doi":"10.1145/1722024.1722074","DOIUrl":"https://doi.org/10.1145/1722024.1722074","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"43"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722074","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64108581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
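The title of the KMeans-PSO record above refers to binary PSO. In the standard binary variant, a particle is a bit vector (here, one bit per gene or condition marking membership in the bicluster), and each velocity component is passed through a sigmoid to give the probability that the bit is set. The sketch below shows one such update step. It assumes the standard formulation rather than the paper's exact variant; the coefficient values are hypothetical, and the fitness evaluation that would choose the personal and global bests is left out.

```python
import math
import random

def binary_pso_step(position, velocity, personal_best, global_best, rng,
                    inertia=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary PSO update; returns (new_position, new_velocity).

    position, personal_best and global_best are lists of 0/1 bits.
    All coefficient values are illustrative assumptions.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Pull the velocity toward the particle's own best and the swarm's best.
        v = inertia * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))            # clamp to keep probabilities away from 0 and 1
        prob_one = 1.0 / (1.0 + math.exp(-v))     # sigmoid maps velocity to P(bit = 1)
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob_one else 0)
    return new_position, new_velocity

if __name__ == "__main__":
    rng = random.Random(0)
    pos = [0, 1, 0, 1, 1]          # toy membership bits for genes and conditions
    vel = [0.0] * 5
    pbest = [1, 1, 0, 0, 1]
    gbest = [1, 0, 0, 1, 1]
    print(binary_pso_step(pos, vel, pbest, gbest, rng))
```

In a full run this step would be repeated for every particle, with the bests updated after each fitness evaluation.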
Transcriptional regulatory mechanisms are mediated by a set of transcription factors (TFs), which bind to specific regions (motifs, or transcription factor binding sites, TFBS) on the target gene(s), leading to gene expression. Eukaryotic regulatory motifs, referred to as cis regulatory modules (CRMs), tend to co-occur near the regulated gene's transcription start site and provide the building blocks of transcriptional regulatory networks that model the relevant TF-TFBS interactions. Here, we study IL-12-stimulated transcriptional regulators in STAT4-mediated T helper 1 (Th1) cell development by focusing on the identification of TFBS and CRMs using a set of Stat4 ChIP-on-chip target genes. A region containing 2000 bases of Mus musculus sequence with the Stat4 binding site, derived from the ChIP-on-chip data, has been characterized for enrichment of other motifs and thus CRMs. We find two such motifs (NF-κB and PPARγ/RXR) enriched in the Stat4 binding sequences compared to neighboring background sequences and sets of random sequences of equal size. Furthermore, these predicted CRMs were observed to be associated with biologically relevant target genes in the ChIP-on-chip data set through meaningful gene ontology annotations. These analyses will lead to a better understanding of transcriptional regulatory networks in IL-12-stimulated, Stat4-mediated Th1 cell differentiation.
{"title":"Cis regulatory module discovery in immune cell development","authors":"S. R. Ganakammal, M. Kaplan, N. Perumal","doi":"10.1145/1722024.1722039","DOIUrl":"https://doi.org/10.1145/1722024.1722039","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"32 1","pages":"12"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722039","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64107849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Searching for structural motifs that specify the spatial arrangement of polypeptide segments is preferred over other methods, such as common substructure discovery and structural superposition, for comparing protein structures. 3D protein structures can be modeled as graphs whose maximum degree is bounded by a constant. Structural motifs can also be modeled as graphs, and a significant percentage of them are trees. Thus, motif search in proteins can be modeled as an enumeration of isomorphic subgraphs, where a query tree Q with m nodes is searched for in a sparse graph G with n nodes whose maximum node degree is bounded by a constant ε. We design an efficient divide-and-conquer algorithm that finds all copies of Q in G by partitioning Q using a minimum dominating set. This strategy can be extended to sparse query graphs that can be reduced to trees by deleting a small number of edges.
{"title":"Complete enumeration of compact structural motifs in proteins","authors":"Bhadrachalam Chitturi, D. Bein, N. Grishin","doi":"10.1145/1722024.1722047","DOIUrl":"https://doi.org/10.1145/1722024.1722047","url":null,"PeriodicalId":39379,"journal":{"name":"In Silico Biology","volume":"1 1","pages":"19"},"PeriodicalIF":0.0,"publicationDate":"2010-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/1722024.1722047","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64108140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
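The problem setting in the structural-motif abstract above, finding every copy of a query tree Q in a bounded-degree graph G, can be illustrated with a plain backtracking baseline. This is not the paper's divide-and-conquer algorithm and does not use a minimum dominating set. It simply maps the root of Q to every node of G and then extends the mapping along tree edges. With maximum degree ε in G this explores at most n·ε^(m−1) partial mappings. The function name and the toy graphs are hypothetical.

```python
def enumerate_tree_embeddings(query_tree, graph, root):
    """Return all injective mappings of query_tree nodes to graph nodes that preserve tree edges.

    Both arguments are adjacency dicts {node: [neighbours]}. Mappings that differ
    only by a symmetry of the query tree are reported separately.
    """
    # Order query nodes by BFS so each non-root node comes after its parent.
    order, parent = [root], {root: None}
    for node in order:                      # the list grows while we iterate over it
        for nb in query_tree[node]:
            if nb not in parent:
                parent[nb] = node
                order.append(nb)

    results = []

    def extend(index, mapping, used):
        if index == len(order):
            results.append(dict(mapping))
            return
        q = order[index]
        # The root may go anywhere; other nodes must be adjacent to their parent's image.
        candidates = graph if parent[q] is None else graph[mapping[parent[q]]]
        for g in candidates:
            if g not in used:
                mapping[q] = g
                used.add(g)
                extend(index + 1, mapping, used)
                used.remove(g)
                del mapping[q]

    extend(0, {}, set())
    return results

if __name__ == "__main__":
    path_query = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}   # toy query tree: a - b - c
    triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
    for m in enumerate_tree_embeddings(path_query, triangle, root="a"):
        print(m)
```

Since Q is a tree, checking only the edge to the parent is enough to guarantee that every edge of Q lands on an edge of G.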