
Computers & chemistry: latest articles

Predicting function from structure: examples of the serine protease inhibitor canonical loop conformation found in extracellular proteins
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00097-3
Richard M Jackson, Robert B Russell

The prediction of protein function from structure is of growing importance in the age of structural genomics. We have focused on the problem of identifying sites of potential serine protease inhibitor interactions on the surface of proteins of known structure. Given that there is no sequence conservation within canonical loops from different inhibitor families, we first compare representative loops to all fragments of equal length among proteins of known structure by calculating main-chain RMS deviation. Fragments with RMS deviation below a certain threshold (hits) are removed if their residues have solvent accessibilities appreciably lower than those observed in the search structure. The remaining hits are further filtered to remove those occurring largely within secondary structure elements. Likely functional significance is restricted further by considering only extracellular protein domains. A test is also performed to see whether the loop can dock into the binding site of the serine protease trypsin without unacceptable steric clashes. By comparing different canonical loop structures to the protein structure database, we show that the method was able to detect previously known inhibitors. In addition, we discuss potentially new canonical loop structures found in secreted hydrolases, toxins, viral proteins, cytokines and other proteins, along with the possible functional significance of several of the examples found.
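
The first step of the search described above (sliding a query loop along a chain and keeping fragments under a threshold) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it swaps the superposition-based main-chain RMS deviation for a distance-matrix RMSD, which needs no structural superposition, and the names `distance_rmsd` and `find_hits`, the coordinates, and the threshold are all hypothetical. The later filters (solvent accessibility, secondary structure, extracellular location, docking into trypsin) are not modelled.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """RMS difference between the intra-fragment pairwise distances of a and b.

    Stand-in for superposition-based RMS deviation (an assumption of this sketch).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("fragments must have equal length of at least 2")
    total, pairs = 0.0, 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            total += diff * diff
            pairs += 1
    return math.sqrt(total / pairs)


def find_hits(query: Sequence[Point], chain: Sequence[Point],
              threshold: float) -> List[Tuple[int, float]]:
    """Return (start index, score) for every window of the chain scoring below threshold."""
    n = len(query)
    hits = []
    for start in range(len(chain) - n + 1):
        score = distance_rmsd(query, chain[start:start + n])
        if score < threshold:
            hits.append((start, score))
    return hits


# Hypothetical toy data: a straight three-point query and a chain with one kink.
query = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain = [(0.0, 0.0, 0.0), (0.0, 3.8, 0.0), (3.8, 3.8, 0.0),
         (7.6, 3.8, 0.0), (11.4, 3.8, 0.0)]
hits = find_hits(query, chain, threshold=0.5)
print([start for start, _ in hits])
```

In this toy case the kinked window starting at index 0 is rejected, while the two straight windows are kept as hits.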

Citations: 10
Modelling protein side-chain conformations using constraint logic programming
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00103-6
Martin T Swain, Graham J.L Kemp

Side-chain placement is an important sub-task in protein modelling. Selecting conformations for side-chains is a difficult problem because of the large search space to be explored. This problem can be addressed using constraint logic programming (CLP), which is an artificial intelligence technique developed to solve large combinatorial search problems. The side-chain placement problem can be expressed as a CLP program in which rotamer conformations are used as values for finite domain variables, and bad steric contacts involving rotamers are represented as constraints. This paper introduces the concept of null rotamers, and shows how these can be used in implementing a novel iterative approach. We present results that compare the accuracy of models constructed using different rotamer libraries and different domain variable enumeration heuristics. The results obtained using this CLP-based approach compare favourably with those obtained by other methods.
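
The formulation above, with rotamers as values of finite domain variables and bad steric contacts as constraints, can be illustrated with a small backtracking search. This is a sketch in plain Python rather than a CLP system, and it does not model the paper's null rotamers or iterative approach. The residue names, rotamer labels, clash list, and the smallest-domain-first ordering are hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Choice = Tuple[str, str]            # (residue, rotamer)
Assignment = Dict[str, str]         # residue -> chosen rotamer


def place_side_chains(domains: Dict[str, List[str]],
                      clashes: Set[FrozenSet[Choice]]) -> Optional[Assignment]:
    """Pick one rotamer per residue so that no chosen pair is a listed clash."""
    # Enumerate the most constrained residues first (one possible heuristic).
    residues = sorted(domains, key=lambda r: len(domains[r]))

    def consistent(res: str, rot: str, assignment: Assignment) -> bool:
        return all(frozenset({(res, rot), (other, chosen)}) not in clashes
                   for other, chosen in assignment.items())

    def backtrack(index: int, assignment: Assignment) -> Optional[Assignment]:
        if index == len(residues):
            return dict(assignment)
        res = residues[index]
        for rot in domains[res]:
            if consistent(res, rot, assignment):
                assignment[res] = rot
                result = backtrack(index + 1, assignment)
                if result is not None:
                    return result
                del assignment[res]   # undo and try the next rotamer
        return None

    return backtrack(0, {})


# Hypothetical toy problem: three residues, a few forbidden rotamer pairs.
domains = {"LEU10": ["r1", "r2"], "PHE24": ["r1", "r2"], "SER31": ["r1"]}
clashes = {
    frozenset({("LEU10", "r1"), ("PHE24", "r1")}),
    frozenset({("PHE24", "r1"), ("SER31", "r1")}),
    frozenset({("LEU10", "r2"), ("PHE24", "r2")}),
}
solution = place_side_chains(domains, clashes)
print(solution)
```

A real CLP solver would also propagate constraints to prune domains before and during enumeration; the sketch only checks constraints as values are assigned.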

Citations: 8
AI-based algorithms for protein surface comparisons
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00102-4
Steven J Pickering, Andrew J Bulpitt, Nick Efford, Nicola D Gold, David R Westhead

Many current methods for protein analysis depend on the detection of similarity in either the primary sequence, or the overall tertiary structure (the Cα atoms of the protein backbone). These common sequences or structures may imply similar functional characteristics or active properties. Active sites and ligand binding sites usually occur on or near the surface of the protein; so similarly shaped surface regions could imply similar functions. We investigate various methods for describing the shape properties of protein surfaces and for comparing them. Our current work uses algorithms from computer vision to describe the protein surfaces, and methods from graph theory to compare the surface regions. Early results indicate that we can successfully match a family of related ligand binding sites, and find their similarly shaped surface regions. This method of surface analysis could be extended to help identify unknown surface regions for possible ligand binding or active sites.

Citations: 22
Artificial Intelligence in Bioinformatics
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00104-8
Dave W Corne, Andrew C.R Martin
Citations: 3
Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00099-7
Martin G Reese

Computational methods for automated genome annotation are critical to understanding and interpreting the bewildering mass of genomic sequence data presently being generated and released. A neural network model of the structural and compositional properties of a eukaryotic core promoter region has been developed and its application for analysis of the Drosophila melanogaster genome is presented. The model uses a time-delay architecture, a special case of a feed-forward neural network. The structure of this model allows for variable spacing between functional binding sites, which is known to play a key role in the transcription initiation process. Application of this model to a test set of core promoters not only gave better discrimination of potential promoter sites than previous statistical or neural network models, but also indirectly revealed subtle properties of the transcription initiation signal. When tested in the Adh region of 2.9 Mbases of the Drosophila genome, the neural network for promoter prediction (nnpp) program that incorporates the time-delay neural network model gives a recognition rate of 75% (69/92) with a false positive rate of 1/547 bases. The present work can be regarded as one of the first intensive studies that apply novel gene regulation technologies to the identification of the complex gene regulation sites in the genome of Drosophila melanogaster.
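
The figures quoted above can be put in absolute terms with a short calculation. This sketch assumes that the 2.9 Mbase region is exactly 2,900,000 bases and that the 1/547 rate applies uniformly across it; the abstract does not say whether one or both strands are counted, so the result is only an order-of-magnitude estimate. The function names are hypothetical.

```python
def recognition_rate(true_positives: int, annotated_sites: int) -> float:
    """Fraction of annotated promoters that the predictor recovered."""
    return true_positives / annotated_sites


def expected_false_positives(region_length_bases: int,
                             bases_per_false_positive: int) -> float:
    """Expected number of false predictions for a given false positive rate."""
    return region_length_bases / bases_per_false_positive


rate = recognition_rate(69, 92)
false_hits = expected_false_positives(2_900_000, 547)  # assumes exactly 2.9e6 bases
print(f"recognition rate: {rate:.0%}")                   # recognition rate: 75%
print(f"expected false positives: about {false_hits:.0f}")  # about 5302
```

Under these assumptions, 69 true promoters are recovered alongside roughly 5,300 false predictions, which shows how small a per-base error rate must be to matter at genome scale.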

Citations: 820
Applications of neural network prediction of conformational states for small peptides from spectra and of fold classes
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00101-2
H.G Bohr, P Røgen, K.J Jalkanen

Electronic structures of small peptides were calculated ‘ab initio’ with the help of Density Functional Theory (DFT) and molecular dynamics that rendered a set of conformational states of the peptides. For the structures of these states it was possible to derive atomic polar tensors that allowed us to construct vibrational spectra for each of the conformational states with low energy. From the spectra, neural networks could be trained to distinguish between the various states and thus be able to generate a larger set of relevant structures and their relation to secondary structures of the peptides. The calculations were done both with solvent atoms (up to ten water molecules) and without, and hence the neural networks could be used to monitor the influence of the solvent on hydrogen bond formation. The calculations at this stage only involved very short peptide fragments of a few alanine amino acids, but already at this stage they could be compared with experiment, with reasonable agreement. The neural networks are shown to be good at distinguishing the different conformers of the small alanine peptides, especially in the gas phase. The task of predicting protein fold-classes, defined from line-geometry, also seems promising.

Citations: 4
A computer system to perform structure comparison using TOPS representations of protein structure
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00096-1
David Gilbert, David Westhead, Juris Viksna, Janet Thornton

We describe the design and implementation of a fast topology-based method for protein structure comparison. The approach uses the TOPS topological representation of protein structure, aligning two structures using a common discovered pattern and generating a measure of distance derived from an insert score. Heavy use is made of a constraint-based pattern-matching algorithm for TOPS diagrams that we have designed and described elsewhere (Bioinformatics 15(4) (1999) 317). The comparison system is maintained at the European Bioinformatics Institute and is available over the Web at tops.ebi.ac.uk/tops. Users submit a structure description in Protein Data Bank (PDB) format and can compare it with structures in the entire PDB or a representative subset of protein domains, receiving the results by email.

Citations: 32
Generating protein three-dimensional fold signatures using inductive logic programming
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00100-0
M Turcotte, S.H Muggleton, M.J.E Sternberg

Inductive logic programming (ILP) has been applied to automatically discover protein fold signatures. This paper investigates the use of topological information to circumvent problems encountered during previous experiments, namely (1) matching of non-structurally related secondary structures and (2) scaling problems. Cross-validation tests were carried out for 20 folds. The overall estimated accuracy is 73.37±0.35%. The new representation allows us to process the complete set of examples, while previously it was necessary to sample the negative examples. Topological information is used in approximately 90% of the rules presented here. Information about the topology of a sheet is present in 63% of the rules. This set of rules presents characteristics of the overall architecture of the fold. In contrast, 26% of the rules contain topological information which is limited to the packing of a restricted number of secondary structures; as such, the latter set resembles those found in our previous studies.

Citations: 7
Medical target prediction from genome sequence: combining different sequence analysis algorithms with expert knowledge and input from artificial intelligence approaches
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00095-X
Thomas Dandekar, Fuli Du, R. Heiner Schirmer, Steffen Schmidt

By exploiting the rapid increase in available sequence data, the definition of medically relevant protein targets has been improved by a combination of: (i) differential genome analysis (target list); and (ii) analysis of individual proteins (target analysis). Fast sequence comparisons, data mining, and genetic algorithms further promote these procedures. Mycobacterium tuberculosis proteins were chosen as applied examples.

Citations: 9
Drug design by machine learning: support vector machines for pharmaceutical data analysis
Pub Date : 2001-12-01 DOI: 10.1016/S0097-8485(01)00094-8
R. Burbidge, M. Trotter, B. Buxton, S. Holden

We show that the support vector machine (SVM) classification algorithm, a recent development from the machine learning community, proves its potential for structure–activity relationship analysis. In a benchmark test, the SVM is compared to several machine learning techniques currently used in the field. The classification task involves predicting the inhibition of dihydrofolate reductase by pyrimidines, using data obtained from the UCI machine learning repository. Three artificial neural networks, a radial basis function network, and a C5.0 decision tree are all outperformed by the SVM. The SVM is significantly better than all of these, bar a manually capacity-controlled neural network, which takes considerably longer to train.
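
To make the classification idea above concrete, here is a minimal sketch of a linear soft-margin SVM trained by subgradient descent on the regularised hinge loss. It is not the authors' setup: the abstract does not state the kernel or solver, the toy points are invented stand-ins rather than the UCI pyrimidine data, and the function names and hyperparameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float]   # (weights, bias)


def train_linear_svm(samples: Sequence[Sequence[float]], labels: Sequence[int],
                     epochs: int = 20, learning_rate: float = 0.1,
                     regularization: float = 0.01) -> Model:
    """Fit weights and bias by stepping along the hinge-loss subgradient."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            # The L2 penalty shrinks the weights on every step.
            weights = [w - learning_rate * regularization * w for w in weights]
            if margin < 1:
                # Point is inside the margin or misclassified: push it outward.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict(model: Model, x: Sequence[float]) -> int:
    """Return +1 or -1 according to the side of the separating hyperplane."""
    weights, bias = model
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1


# Hypothetical two-descriptor compounds: +1 = active, -1 = inactive.
samples = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
labels = [1, 1, -1, -1]
model = train_linear_svm(samples, labels)
print([predict(model, x) for x in samples])
```

The regularisation strength plays the role of capacity control here: a larger value favours a wider margin at the cost of more training points falling inside it.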

Citations: 620