Computers & chemistry: latest articles

New molecular surface-based 3D-QSAR method using Kohonen neural network and 3-way PLS

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00023-2

Kiyoshi Hasegawa , Shigeo Matsuoka , Masamoto Arakawa , Kimito Funatsu

Comparative molecular field analysis (CoMFA) has been widely used as a standard three-dimensional quantitative structure–activity relationship (3D-QSAR) method. Although CoMFA is a useful technique, it does not always reflect the real ligand–receptor interaction. Molecular interactions between the ligand and the receptor occur mainly near the van der Waals surface of the ligand, so not all of the grid points surrounding the whole molecule in CoMFA are important as molecular descriptors. If each molecule is represented by physico-chemical parameters on its molecular surface, a more precise and realistic 3D-QSAR becomes possible. We developed a new surface-based 3D-QSAR method using a Kohonen neural network (KNN) and three-way partial least squares (3-way PLS). This method was applied to 25 dopamine 2 (D2) receptor antagonists for validation. First, the 3D coordinates of all sampling points on the van der Waals surface were projected onto a 2D map by the KNN. Each node in the map was coded by the associated molecular electrostatic potential (MEP) value of the original sampling point. Then, the correlation between the MEP values of all 2D maps and the D2 receptor antagonist activities was analyzed by 3-way PLS. The statistics of the 3-way PLS model were excellent, and the coefficients back-projected onto the van der Waals surface had a reasonable 3D distribution. Lastly, all data were divided into calibration and validation sets by D-optimal design, and the activities of the validation set were predicted. The external validation suggested that 3-way PLS is better than standard (2-way) PLS for prediction.
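The projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the planar 6 × 6 map, the learning-rate and neighbourhood schedules, the random points on a unit sphere standing in for a van der Waals surface, and the use of the z coordinate as a placeholder "MEP" value are all assumptions made for illustration. Nodes that win several points are coded here with the mean value, which is also an assumption.

```python
import math
import random


def best_node(weights, p):
    """Return the (row, col) of the node whose weight vector is closest to p."""
    return min(weights, key=lambda rc: sum((a - b) ** 2 for a, b in zip(weights[rc], p)))


def train_som(points, rows, cols, epochs=10, seed=0):
    """Train a small planar Kohonen map on 3D points and return its node weights."""
    rng = random.Random(seed)
    # Initialise every node with a randomly chosen input point.
    weights = {(r, c): list(rng.choice(points)) for r in range(rows) for c in range(cols)}
    total = epochs * len(points)
    step = 0
    for _ in range(epochs):
        for p in rng.sample(points, len(points)):
            frac = step / total
            rate = 0.5 * (1.0 - frac)  # learning rate decays linearly
            radius = 1.0 + 0.5 * max(rows, cols) * (1.0 - frac)  # neighbourhood shrinks
            br, bc = best_node(weights, p)
            for (r, c), w in weights.items():
                g = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * radius ** 2))
                for k in range(3):
                    w[k] += rate * g * (p[k] - w[k])
            step += 1
    return weights


def mep_map(points, values, weights, rows, cols):
    """Code each node with the mean value of the points it wins (None if it wins none)."""
    hits = {}
    for p, v in zip(points, values):
        hits.setdefault(best_node(weights, p), []).append(v)
    return [
        [sum(hits[(r, c)]) / len(hits[(r, c)]) if (r, c) in hits else None for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical input: 100 random points on a unit sphere with a placeholder "MEP".
rng = random.Random(1)
points = []
for _ in range(100):
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    points.append((s * math.cos(phi), s * math.sin(phi), z))
mep = [p[2] for p in points]

weights = train_som(points, rows=6, cols=6)
grid = mep_map(points, mep, weights, rows=6, cols=6)
```

Each molecule would yield one such 2D grid, and stacking the grids of all molecules gives the three-way array (molecule × row × column) that a 3-way PLS model takes as input.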


Volume 26, Issue 6, Pages 583-589

Citations: 30

Automatic identification by 13C NMR of substituent groups bonded in natural product skeletons

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00029-3

Marcelo J.P Ferreira , Francimeiry C Oliveira , Sandra A.V Alvarenga , Patrícia A.T Macari , Gilberto V Rodrigues , Vicente P Emerenciano

The aim of this paper is to present a procedure that uses ¹³C NMR to identify substituent groups bonded to the carbon skeletons of natural products. To this end, a new version of the program macrono was developed, which includes a database of 161 substituent types found in a wide variety of terpenoids. This new version was extensively tested on the identification of the substituents of 60 compounds; after removal of the signals that did not belong to the carbon skeleton, these compounds served to test skeleton prediction by other programs of the expert system SISTEMAT.


Citations: 16

Use of the Numerov method to improve the accuracy of the spatial discretisation in finite-difference electrochemical kinetic simulations

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00039-6

L.K. Bieniasz

The fourth order accuracy of the spatial discretisation of time-dependent reaction–diffusion equations, in finite-difference electrochemical kinetic simulations in one space dimension, might well be achieved by means of the three-point Numerov method, instead of the 5(6)-point discretisation of second spatial derivatives, recently suggested in the literature. This is proven theoretically, and tested in simulations of potential-step chronoamperometric and current-step chronopotentiometric transients for the Reinert–Berg system, which is a classical example of electrochemical reaction–diffusion equations. Although less generally applicable than the 5(6)-point spatial scheme, the Numerov discretisation is easier to use, because it does not lead to increased linear equation matrix bandwidth, but results in quasi-block-tridiagonal matrices, similar to those for the conventional, second order accurate, three-point spatial discretisation. The simulations reveal that the Numerov method brings an improvement of accuracy and efficiency that is comparable with the one offered by the 5(6)-point spatial scheme.
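A minimal sketch can show why a three-point Numerov stencil reaches fourth-order accuracy while keeping a tridiagonal matrix. It does not reproduce the paper's time-dependent electrochemical simulations: it assumes a steady model problem u'' = f(x) on [0, 1] with u(0) = u(1) = 0 and the known solution u = sin(πx), and the function names are hypothetical. The Numerov scheme replaces the right-hand side f_i of the conventional second-order scheme by (f_{i-1} + 10 f_i + f_{i+1})/12.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def max_error(n, numerov):
    """Maximum nodal error for u'' = -pi^2 sin(pi x), u(0) = u(1) = 0, on n intervals."""
    h = 1.0 / n
    xs = [i * h for i in range(n + 1)]
    f = [-math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    m = n - 1  # number of interior nodes
    lower = [1.0 / h ** 2] * m
    diag = [-2.0 / h ** 2] * m
    upper = [1.0 / h ** 2] * m
    if numerov:
        # Numerov: weighted average of f over the same three points.
        rhs = [(f[i - 1] + 10.0 * f[i] + f[i + 1]) / 12.0 for i in range(1, n)]
    else:
        # Conventional second-order three-point scheme.
        rhs = [f[i] for i in range(1, n)]
    u = solve_tridiagonal(lower, diag, upper, rhs)
    return max(abs(u[i - 1] - math.sin(math.pi * xs[i])) for i in range(1, n))


for n in (10, 20, 40):
    print(n, max_error(n, numerov=False), max_error(n, numerov=True))
```

Halving the step size should reduce the conventional error by a factor of about 4 and the Numerov error by a factor of about 16, with both schemes solving a system of the same tridiagonal bandwidth.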


Volume 26, Issue 6, Pages 633-644

Citations: 18

Application of a Kohonen neural network to the analysis of data regarding the alkylation of toluene with methanol catalyzed by ZSM-5 type zeolites

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00020-7

J Petit , J Zupan , L Leherte , D.P Vercauteren

para-Xylene is widely used in chemical industry. It can be synthesized by alkylation of toluene with methanol using zeolite ZSM-5 as catalyst. The proportion of para-xylene, among its other isomers and other reaction byproducts, depends on the reaction conditions. As this process still remains largely empirical, we attempted to build a theoretical model able to predict the para-xylene yield under specific reaction conditions. We have consequently collected data regarding this reaction from the literature and exploited the potency of a particular artificial neural network (ANN), the counter-propagation ANN based on the Kohonen technique. The results show that such an approach is suitable to establish a predictive model of the yield in para-xylene on the basis of reaction parameters. The quality of the model could be further improved by considering a larger valuable data set, e.g. including experiments characterized by a low yield in para-xylene.


Citations: 4

On the solution of mixed-integer nonlinear programming models for computer aided molecular design

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00049-9

Guennadi M. Ostrovsky, Luke E.K. Achenie, Manish Sinha

This paper addresses the efficient solution of computer aided molecular design (CAMD) problems, which have been posed as mixed-integer nonlinear programming models. The models of interest are those in which the number of linear constraints far exceeds the number of nonlinear constraints, and with most variables participating in the nonconvex terms. As a result global optimization methods are needed. A branch-and-bound algorithm (BB) is proposed that is specifically tailored to solving such problems. In a conventional BB algorithm, branching is performed on all the search variables that appear in the nonlinear terms. This translates to a large number of node traversals. To overcome this problem, we have proposed a new strategy for branching on a set of linear branching functions, which depend linearly on the search variables. This leads to a significant reduction in the dimensionality of the search space. The construction of linear underestimators for a class of functions is also presented. The CAMD problem that is considered is the design of optimal solvents to be used as cleaning agents in lithographic printing.


Volume 26, Issue 6, Pages 645-660

Citations: 28

Fuzzy logic model of Langmuir probe discharge data

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00021-9

Byungwhan Kim , Jang Hyun Park , Beom-Soo Kim

Plasma models are crucial for gaining physical insight into complex discharges as well as for optimizing plasma-driven processes. As an alternative to a physical model, a qualitative model was constructed using an adaptive fuzzy logic technique called the adaptive network fuzzy inference system (ANFIS). The prediction performance of ANFIS was evaluated on two sets of experimental discharge data. One, referred to as hemispherical inductively coupled plasma (HICP), was characterized with a 2⁴ full factorial experiment, in which the factors varied were source power, pressure, chuck position, and Cl₂ flow rate. The other, called multipole ICP (MICP), was characterized by a 3³ full factorial experiment on source power, pressure, and Ar flow rate. Trained ANFIS models were tested on eight and 16 experiments not included in the training data for HICP and MICP, respectively. The plasma attributes modeled include electron density, electron temperature, and plasma potential. The performance of ANFIS was optimized as a function of the type of membership function, the number of membership functions, and two learning factors. The optimal number of membership functions differed depending on the type of plasma data, and employing too large a number of membership functions resulted in a drastic degradation in prediction performance. Optimized ANFIS models were compared to statistical regression models and demonstrated improved predictions in all comparisons.
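The two experimental designs named in this abstract can be enumerated with a short sketch. The factor names follow the abstract, but the level labels ("low", "mid", "high") and the helper `full_factorial` are hypothetical, since the abstract gives no numerical settings.

```python
from itertools import product


def full_factorial(levels):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


# 2^4 design for the HICP data: four factors at two (placeholder) levels each.
hicp_runs = full_factorial({
    "source_power": ["low", "high"],
    "pressure": ["low", "high"],
    "chuck_position": ["low", "high"],
    "cl2_flow_rate": ["low", "high"],
})

# 3^3 design for the MICP data: three factors at three (placeholder) levels each.
micp_runs = full_factorial({
    "source_power": ["low", "mid", "high"],
    "pressure": ["low", "mid", "high"],
    "ar_flow_rate": ["low", "mid", "high"],
})

print(len(hicp_runs), len(micp_runs))
```

The 2⁴ design gives 16 runs and the 3³ design gives 27 runs, which are the training sets against which the separate test experiments mentioned in the abstract are held out.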


Volume 26, Issue 6, Pages 573-581

Citations: 16

Constructing a useful tool for characterizing amino acid conformers by means of quantum chemical and graph theory indices

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00052-9

Constanza Cárdenas , Mateo Obregón , Eugenio-José Llanos , Eduardo Machado , Hugo-Javier Bohórquez , Jose-Luis Villaveces , Manuel-Elkin Patarroyo

The aim of this work is to construct a tool to assist in the prediction of peptidic properties resulting from the exchange of two amino acids in a proteic chain. In the past others have used experimental properties for this purpose. However, the nature of these data sets severely limits their access to important properties pertaining to secondary structure, and hence the indices used cannot characterize different backbone conformers like α helix and β strands, or side-chain conformations like gauche+, gauche− and trans. In this study we explore the importance of backbone and side-chain angles with regard to conformer similarity measured with theoretical properties calculated in an ab initio manner. For each of the 20 genetically encoded amino acids, we studied five conformers that correspond to α helical and β strand structures, with three different side chain conformations for each, defined solely by their angles Φ, Ψ and χ₁. This methodology allowed each of the 108 conformers to be represented by a mathematical object without ambiguity. The peptidic chain was emulated using two capping models to simulate the effect of nearest neighbors. These are OHCX_aaNH₂ and AlaX_aaAla, where X_aa is the conformer of interest. We then calculated 40 ab initio quantum chemical and graph theory indices for each backbone-side-chain conformer to obtain a characterization and classification scheme. We found that: (1) while backbone structure is very important to conformer similarity, side-chain conformations do not cluster together in a top-level manner; (2) amino acids with π electrons group together independent of backbone conformation.


Volume 26, Issue 6, Pages 667-682

Citations: 5

CLiBE: a database of computed ligand binding energy for ligand–receptor complexes

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00050-5

X. Chen, Z.L. Ji, D.G. Zhi, Y.Z. Chen

Consideration of the binding competitiveness of a drug candidate against natural ligands and other drugs that bind to the same receptor site may facilitate the rational development of a candidate into a potent drug. A strategy that can be applied to computer-aided drug design is to compare the ligand–receptor interaction energy, or other scoring functions, of a designed drug with those of the relevant ligands known to bind to the same binding site. As a tool to facilitate such a strategy, a database of ligand–receptor interaction energy was developed from known ligand–receptor 3D structural entries in the Protein Data Bank (PDB). The energy is computed with a molecular mechanics force field that has been used in the prediction of therapeutic and toxicity targets of drugs. The database also contains information about ligand function and other properties, and it can be accessed at http://xin.cz3.nus.edu.sg/group/CLiBE.asp. The computed energy components may facilitate the probing of the mode of action and other profiles of binding. The computed energies of a number of PDB ligand–receptor complexes in this database are studied and compared to experimental binding affinities. A certain degree of correlation between the computed energy and the experimental binding affinity is found, which suggests that the computed energy may be useful in facilitating a qualitative analysis of drug binding competitiveness.


Volume 26, Issue 6, Pages 661-666

Citations: 17

A new redundant variable pruning approach—minor latent variable perturbation–PLS used for QSAR studies on anti-HIV drugs

Computers & chemistry

Pub Date : 2002-11-01 DOI: 10.1016/S0097-8485(02)00022-0

Hong-Ping Xie , Jian-Hui Jiang , Hui Cui , Guo-Li Shen , Ru-Qin Yu

A new approach for eliminating the redundant variables in the multivariable data matrix encountered in QSAR studies, the minor latent variable perturbation (MLVP)-PLS method, is proposed. In the latent variable (LV) space, the minor latent variables (LVs) with small covariances are formed mainly by linear combinations of the redundant variables, including information-deficient and highly correlated ones, while the major LVs with large covariances are contributed mainly by the informative variables. Deleting a minor LV, which is equivalent to a perturbation of the LV space, leaves the redundant variables poorly represented in the LV subspace, leading to strong variation of their PLS regression coefficients. The informative variables are still represented normally in the LV subspace, and their PLS regression coefficients remain relatively stable. MLVP-PLS uses this fact to discriminate between the informative and the redundant variables. It gradually identifies and eliminates the redundant variables according to the relative variation of the PLS regression coefficients after the perturbations are applied. The elimination process is terminated according to some proposed criteria. Applying the method to quantitative structure–activity relationship (QSAR) studies on TIBO derivatives as potential anti-HIV drugs has demonstrated the feasibility and robustness of the proposed approach. A deeper insight into the effect of different structural parameters on the bio-activity of TIBO derivatives has been reached.


Volume 26, Issue 6, Pages 591-600

Citations: 12

Bias of purine stretches in sequenced chromosomes

Computers & chemistry

Pub Date : 2002-07-01 DOI: 10.1016/S0097-8485(02)00013-X

David Ussery, Dikeos Mario Soumpasis, Søren Brunak, Hans Henrik Stærfeldt, Peder Worning, Anders Krogh

We examined more than 700 DNA sequences (full-length chromosomes and plasmids) for stretches of purines (R) or pyrimidines (Y) and for alternating YR stretches; such regions are likely to adopt structures that differ from the canonical B-form. Since one turn of the DNA helix is roughly 10 bp, we measured the fraction of each genome that contains purine (or pyrimidine) tracts 10 bp or longer (hereafter referred to as ‘purine tracts’), as well as stretches of alternating pyrimidines/purines (‘pyr/pur tracts’) of the same length. By this criterion, a random sequence would be expected to contain 1.0% purine tracts and 1.0% alternating pyr/pur tracts. In the vast majority of cases, there are more purine tracts than would be expected from a random sequence, with an average of 3.5%, significantly larger than the expected value. The fraction of the chromosomes containing pyr/pur tracts was slightly less than expected, with an average of 0.8%. One of the most surprising findings is a clear difference between prokaryotes and eukaryotes in the length distributions of the regions studied. Whereas short-range correlations can explain the length distributions in prokaryotes, eukaryotes contain an abundance of long stretches of purines or alternating purine/pyrimidine tracts that cannot be explained in this way; these sequences are likely to play an important role in eukaryotic chromosome organisation.
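The tract measurement described in this abstract can be sketched as follows. The sketch assumes that a tract is a maximal run of at least 10 bases, that A/G count as purines and C/T as pyrimidines, and that any other symbol (such as N) breaks a run. The function names are hypothetical and the authors' actual counting procedure may differ. For independent, equiprobable bases, a position lies in a same-class run of exactly k bases with probability k/2^(k+1); summing over k ≥ 10 gives 11/1024 ≈ 1.07%, and the same value holds for alternating runs. That is a back-of-envelope figure under this assumption, not one taken from the paper, but it is in line with the roughly 1.0% expectation quoted above.

```python
import random

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def base_class(base):
    """Return 'R' for a purine, 'Y' for a pyrimidine, None for anything else."""
    b = base.upper()
    if b in PURINES:
        return "R"
    if b in PYRIMIDINES:
        return "Y"
    return None  # ambiguous symbols such as N break any tract


def covered_fraction(seq, extends, min_len=10):
    """Fraction of positions lying in maximal runs of at least min_len bases.

    extends(prev, cur) decides whether class cur continues the run ending at prev.
    """
    if not seq:
        return 0.0
    total = run = 0
    prev = None
    for cur in map(base_class, seq):
        if cur is not None and prev is not None and extends(prev, cur):
            run += 1
        else:
            if run >= min_len:      # close the previous run and count it
                total += run
            run = 1 if cur is not None else 0
        prev = cur
    if run >= min_len:              # count a run that reaches the sequence end
        total += run
    return total / len(seq)


def purine_tract_fraction(seq, min_len=10):
    """Fraction of seq in all-purine or all-pyrimidine tracts."""
    return covered_fraction(seq, lambda a, b: a == b, min_len)


def pyr_pur_tract_fraction(seq, min_len=10):
    """Fraction of seq in alternating pyrimidine/purine tracts."""
    return covered_fraction(seq, lambda a, b: a != b, min_len)


if __name__ == "__main__":
    # Random baseline: both fractions should land near 11/1024.
    rng = random.Random(0)
    seq = "".join(rng.choices("ACGT", k=200_000))
    print(purine_tract_fraction(seq), pyr_pur_tract_fraction(seq))
```

Counting runs on a single strand covers both purine and pyrimidine tracts at once, since a purine tract on one strand is a pyrimidine tract on the other.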

Volume 26, Issue 5, Pages 531-541

Citations: 36