
In Silico Biology: Latest Articles

Towards a virtual C. elegans: a framework for simulation and visualization of the neuromuscular system in a 3D physical environment.
Q2 Medicine Pub Date : 2011-01-01 DOI: 10.3233/ISB-2012-0445
Andrey Palyanov, Sergey Khayrulin, Stephen D Larson, Alexander Dibert

The nematode C. elegans is the only animal with a known neuronal wiring diagram, or "connectome". During the last three decades, extensive studies of the C. elegans have provided wide-ranging data about it, but few systematic ways of integrating these data into a dynamic model have been put forward. Here we present a detailed demonstration of a virtual C. elegans aimed at integrating these data in the form of a 3D dynamic model operating in a simulated physical environment. Our current demonstration includes a realistic flexible worm body model, muscular system and a partially implemented ventral neural cord. Our virtual C. elegans demonstrates successful forward and backward locomotion when sending sinusoidal patterns of neuronal activity to groups of motor neurons. To account for the relatively slow propagation velocity and the attenuation of neuronal signals, we introduced "pseudo neurons" into our model to simulate simplified neuronal dynamics. The pseudo neurons also provide a good way of visualizing the nervous system's structure and activity dynamics.
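The sinusoidal drive mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name, the dorsal/ventral split, and the parameters `frequency_hz`, `wavelength_segments` and `direction` are all assumptions made for illustration. The sketch sends a travelling sine wave along a chain of body segments and reverses the direction of travel to switch between forward and backward patterns.

```python
import math


def muscle_activation(n_segments, t, frequency_hz=0.5,
                      wavelength_segments=12.0, direction=1):
    """Return (dorsal, ventral) activation lists for one time point.

    Hypothetical illustration: a sine wave travels along the body, and the
    positive and negative half-waves drive opposite sides (an assumption).
    direction=1 and direction=-1 make the wave travel in opposite directions.
    """
    dorsal, ventral = [], []
    for i in range(n_segments):
        phase = 2 * math.pi * (frequency_hz * t
                               - direction * i / wavelength_segments)
        s = math.sin(phase)
        dorsal.append(max(0.0, s))    # active on the positive half-wave
        ventral.append(max(0.0, -s))  # active on the negative half-wave
    return dorsal, ventral


if __name__ == "__main__":
    d, v = muscle_activation(n_segments=6, t=0.25)
    print([round(x, 2) for x in d])
    print([round(x, 2) for x in v])
```

In this sketch the two sides are never active at the same segment at the same time, which is one simple way to obtain alternating bending.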

Citations: 38
Hydrophobic tint of knot proteins
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722034
P. Anto, S. N. Achuthsankar
Knotted configurations in the native fold of a protein have a great impact on its function. Although knots are identified at the structure level, localizing them has become possible in single-molecule experiments. Signal processing methods, which have played an important role in analysing genomic and proteomic sequences, are also useful for analysing knot proteins. The amino acid hydrophobicity index contributes to knowledge of protein stability. Water capture and release is found to be controllable by the tightening force in knots which are related to this index. Fourier analysis, power spectral density estimation and cross-correlation show that knot proteins are hydrophobic in nature. The set of knot proteins from the proteinKNOT web server (pKNOT) was used for the experiments, which showed that 93% of them are hydrophobic in their knotted core.
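As a rough illustration of applying Fourier analysis to a hydrophobicity signal, the sketch below maps a sequence to numeric values and computes a power spectrum with a plain discrete Fourier transform. The abstract does not say which hydrophobicity index was used; the Kyte-Doolittle scale here is an assumption, and the function name is hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy values (assumed scale; the paper's index is not stated).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_power_spectrum(sequence):
    """Return the power at frequencies k = 0 .. N//2 of the mean-centred signal."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]  # remove the zero-frequency offset
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centred))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


if __name__ == "__main__":
    # A strictly alternating hydrophobic/hydrophilic toy sequence (period 2).
    spec = hydrophobicity_power_spectrum("IRIRIRIR")
    print(max(range(len(spec)), key=spec.__getitem__))
```

For the toy sequence, all of the power falls at the highest frequency index, which corresponds to the period-2 alternation.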
Citations: 0
Improving prediction of protein secondary structure using physicochemical properties of amino acids
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722036
P. Chatterjee, Subhadip Basu, M. Nasipuri
Protein structure prediction is important because it extends our understanding of protein structures and functions. This knowledge is essential for predicting the secondary structures of unknown proteins, as required in applications related to drug discovery. A novel technique for protein secondary structure prediction is presented here. In this work, two levels of multi-layer feed-forward neural networks are used. In the first-level network, sequence profiles from PSI-BLAST and physicochemical properties of amino acids are used for sequence-to-structure predictions. Confidence values for forming helix, sheet and coil, obtained from the first-level network, are then used with the second-level network for structure-to-structure predictions. The overall prediction accuracy obtained through experimentation is in the range of 75.58% to 77.48%. The method is trained and tested on nrDSSP datasets using four-fold cross-validation. It is also tested on target proteins of the Critical Assessment of Protein Structure Prediction experiment (CASP3) and achieves better results than PSIPRED on some target proteins.
Citations: 9
Application of reactive GRASP to the biclustering of gene expression data
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722041
Shyama Das, S. M. Idicula
A bicluster in a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. In this work, biclusters are identified in two steps. In the first step, high-quality bicluster seeds are generated using the KMeans clustering algorithm. These seeds are then enlarged using the Reactive Greedy Randomized Adaptive Search Procedure (RGRASP), a multi-start metaheuristic with two phases, construction and local search. The objective is to identify biclusters of maximum size with a mean squared residue (MSR) lower than a given threshold. Experiments are conducted on both the Yeast and Human Lymphoma datasets. The experimental results on these benchmark datasets demonstrate that RGRASP is capable of identifying high-quality biclusters compared to many existing biclustering algorithms. Compared to the existing algorithm based on the same RGRASP metaheuristic, this algorithm obtains biclusters of larger size and lower mean squared residue on the Yeast dataset. Moreover, in this study RGRASP is applied for the first time to find biclusters in the Human Lymphoma dataset.
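The MSR criterion can be made concrete with a small sketch. It assumes the standard Cheng and Church definition of mean squared residue (each entry minus its row mean and column mean plus the overall mean, squared and averaged over the submatrix); the abstract does not spell the formula out, and the function name and toy data are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix picked out by rows and cols.

    matrix is a list of lists (genes x conditions); rows and cols are index lists.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_rows * n_cols)


if __name__ == "__main__":
    # Rows that differ only by a constant shift form a perfectly coherent bicluster.
    expression = [
        [1.0, 2.0, 3.0, 9.0],
        [2.0, 3.0, 4.0, 0.0],
        [4.0, 5.0, 6.0, 7.0],
    ]
    print(mean_squared_residue(expression, [0, 1, 2], [0, 1, 2]))
    print(mean_squared_residue(expression, [0, 1, 2], [0, 1, 2, 3]))
```

A seed-enlargement step would accept a candidate gene or condition only if the MSR of the enlarged submatrix stays below the chosen threshold.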
Citations: 7
Fuzzy pattern extraction for classification of protein sequences
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722046
Abhijit J. Kulkarni, A. Noronha, Sasanka Roy, S. Angadi
Text mining is an important research area in applied statistics. The present article addresses an important problem from the Bioinformatics field, viz. classification of protein sequences as soluble proteins and inclusion body forming proteins when over-expressed in Escherichia coli (E. coli), using text mining and machine learning techniques. We propose a text mining based algorithm to extract patterns from the protein sequences that are later used in support vector classification algorithm. We report the best classification results for this dataset compared to the existing state of the art. Our algorithm is quite general and can be applied to any biological text data. The extracted patterns may give further insight in underlying dynamics of the sequences that decide the corresponding class membership.
Citations: 2
Study of indole inhibitors to increase the affinity of hnps-PLA2 in inflammatory disease
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722056
Amit Nagal, Swapnil R. Jaiswal, H. Yadav, M. Mohan, P. Ghosh
Many drugs, such as NSAIDs, are used to treat inflammatory diseases, but their limitations encourage further research into inflammation-related diseases. Phospholipases A2 (PLA2s) are enzymes that catalyze the hydrolysis of the sn-2 acyl ester linkage of phospholipids, producing fatty acids and lysophospholipids. Their enzymatic activity is a rate-limiting step in the formation of arachidonic acid and subsequently in the synthesis of leukotrienes and prostaglandins. The current structure-based drug design analysis and comparative docking studies of various hnps-PLA2 indole inhibitor derivatives show that they perform better than other molecules. ADME studies show that indole derivatives have the potential to be safe drugs.
Citations: 0
Finding motifs using harmony search
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722072
Jyotshna Dongardive, Aarti Patil, A. Bir, S. Jamkhedkar, Siby Abraham
The paper proposes a novel methodology for finding motifs in biological data. It uses a music-inspired meta-heuristic optimization technique called harmony search to find motifs. The model uses randomly generated l-mers as the initial harmony memory. Pitch adjustment and random selection are used to generate new l-mers, which are evaluated by a specially defined objective function. The proposed method is experimentally validated using sequences of Human Papillomavirus strains obtained from accredited and authorized sources.
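The loop described above can be sketched as follows. This is a generic harmony search over DNA l-mers, not the paper's implementation: the objective (total Hamming distance from the candidate to its best match in each sequence) stands in for the paper's "specially defined objective function", which the abstract does not give, and the parameter names `hmcr` (harmony memory considering rate) and `par` (pitch adjusting rate) and their defaults are assumptions.

```python
import random

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def total_distance(lmer, sequences):
    """Assumed objective: sum over sequences of the best-window Hamming distance."""
    l = len(lmer)
    return sum(min(hamming(lmer, s[i:i + l]) for i in range(len(s) - l + 1))
               for s in sequences)


def harmony_search_motif(sequences, l, memory_size=10, iterations=500,
                         hmcr=0.9, par=0.3, seed=0):
    """Return (best l-mer, its score); every sequence must have length >= l."""
    rng = random.Random(seed)
    # Initial harmony memory: randomly generated l-mers.
    memory = ["".join(rng.choice(ALPHABET) for _ in range(l))
              for _ in range(memory_size)]
    scores = [total_distance(m, sequences) for m in memory]
    for _ in range(iterations):
        new = []
        for pos in range(l):
            if rng.random() < hmcr:
                ch = rng.choice(memory)[pos]      # memory consideration
                if rng.random() < par:            # pitch adjustment
                    ch = rng.choice([c for c in ALPHABET if c != ch])
            else:
                ch = rng.choice(ALPHABET)         # random selection
            new.append(ch)
        candidate = "".join(new)
        score = total_distance(candidate, sequences)
        worst = max(range(memory_size), key=scores.__getitem__)
        if score < scores[worst]:                 # replace the worst harmony
            memory[worst], scores[worst] = candidate, score
    best = min(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]


if __name__ == "__main__":
    toy = ["TTACGTAA", "GGACGTCC", "CAACGTTG"]  # toy data with a shared ACGT
    print(harmony_search_motif(toy, 4, seed=1))
```

Lower scores mean a closer consensus; a score of 0 would mean the l-mer occurs exactly in every sequence.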
Citations: 8
Biclustering gene expression data using KMeans-binary PSO hybrid
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722074
Shyama Das, S. M. Idicula
Biclustering is a very useful data mining technique which identifies coherent patterns in microarray gene expression data. A bicluster of a gene expression dataset is a subset of genes which exhibit similar expression patterns along a subset of conditions. Biclustering is a powerful analytical tool for the biologist and has generated considerable interest over the past few decades. The problem of locating the most significant biclusters in gene expression data has been shown to be NP-complete. In this paper a PSO-based algorithm is developed for biclustering gene expression data. The algorithm has three steps. In the first step, high-quality bicluster seeds are generated using the KMeans clustering algorithm. In the second step, biclusters are generated from these seeds using particle swarm optimization. In the third step, an iterative search is performed to check whether more genes and conditions can be added within the given threshold value of the mean squared residue score. Experimental results on real datasets show that our approach can effectively find high-quality biclusters.
Citations: 3
Cis regulatory module discovery in immune cell development
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722039
S. R. Ganakammal, M. Kaplan, N. Perumal
Transcriptional regulatory mechanisms are mediated by a set of transcription factors (TFs), which bind to a specific region (motifs or transcription factor binding sites, TFBS), on the target gene(s) leading to gene expression. Eukaryotic regulatory motifs, referred to as cis regulatory modules (CRMs), tend to co-occur near the regulated gene's transcription start site and provide the building blocks to transcriptional regulatory networks that model the relevant TF-TFBS interactions. Here, we study IL-12 stimulated transcriptional regulators in STAT4 mediated T helper 1 (Th1) cell development by focusing on the identification of TFBS and CRMs using a set of Stat4 ChIP-on-chip target genes. A region containing 2000 bases of Mus musculus sequences with the Stat4 binding site, derived from the ChIP-on-chip data, has been characterized for enrichment of other motifs and, thus CRMs. We find two such motifs, (NF-κB and PPARγ/RXR) being enriched in the Stat4 binding sequences compared to neighboring background sequences and sets of random sequences of equal size. Furthermore, these predicted CRMs were observed to be associated with biologically relevant target genes in the ChIP-on-chip data set by meaningful gene ontology annotations. These analyses will lead to a better understanding of transcriptional regulatory networks in IL-12 stimulated Stat4 mediated Th1 cell differentiation.
Citations: 0
Complete enumeration of compact structural motifs in proteins
Q2 Medicine Pub Date : 2010-02-15 DOI: 10.1145/1722024.1722047
Bhadrachalam Chitturi, D. Bein, N. Grishin
The search of structural motifs that specify the spatial arrangement of polypeptide segments is preferred over other methods such as common substructure discovery and structural superposition in comparing protein structures. 3D protein structures can be modeled as graphs whose maximum degree is bounded by a constant. Structural motifs can also be modeled as graphs and a significant percentage of them are trees. Thus, motif search in proteins can be modeled as an enumeration of isomorphic subgraphs where a query tree Q with m nodes is searched in a sparse graph G with n nodes and the maximum degree of any node in G is bounded by a constant ε. We design an efficient divide-and-conquer algorithm that finds all copies of Q in G by partitioning Q using a minimum dominating set. This strategy can be extended to sparse query graphs that can be reduced to trees by deleting a small number of edges.
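To make the problem statement concrete, the sketch below enumerates every copy of a query tree Q in a graph G by plain backtracking. This is a brute-force baseline, not the divide-and-conquer algorithm based on a minimum dominating set that the abstract describes; the function name and the adjacency-dict representation are assumptions. Copies are reported as injective node mappings, so symmetric placements of the same subgraph are listed separately.

```python
def enumerate_tree_embeddings(query, graph, root):
    """List all injective mappings of the query tree into the graph.

    query and graph are adjacency dicts: node -> list of neighbours.
    Every query edge must map onto a graph edge (non-induced matching).
    """
    # Order query nodes so that each node appears after its parent.
    order, parent = [root], {root: None}
    for node in order:  # the list grows while we walk it (breadth-first)
        for nb in query[node]:
            if nb not in parent:
                parent[nb] = node
                order.append(nb)

    results = []

    def extend(index, mapping, used):
        if index == len(order):
            results.append(dict(mapping))
            return
        q = order[index]
        # The root may go anywhere; other nodes must neighbour their parent's image.
        candidates = graph if parent[q] is None else graph[mapping[parent[q]]]
        for g in candidates:
            if g not in used:
                mapping[q] = g
                used.add(g)
                extend(index + 1, mapping, used)
                used.remove(g)
                del mapping[q]

    extend(0, {}, set())
    return results


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    path3 = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    for m in enumerate_tree_embeddings(path3, triangle, "b"):
        print(m)
```

Because Q is a tree, checking only the edge to the parent is sufficient; with a degree bound on G, each non-root query node has at most that many candidate images.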
Citations: 10