Denis Znamenskiy, Jacques Chomilier, Khan Le Tuan, Jean-Paul Mornon
Title: A new protein folding algorithm based on hydrophobic compactness: Rigid Unconnected Secondary Structure Iterative Assembly (RUSSIA). I: Methodology.
Journal: Protein Engineering (journal article), published 2003-12-01
DOI: 10.1093/protein/gzg140 (https://doi.org/10.1093/protein/gzg140)
Citation count: 2
Abstract
We present an algorithm that proposes compact models of protein 3D structures starting only from the predicted nature and length of the regular secondary structures. Helices are modeled as cylinders and sheets as helicoid surfaces, with all strands of a sheet treated as a single block; the relative topology of the strands within each sheet is therefore a prerequisite. Loops are considered only as constraints, given by the maximal distance between their Cα extremities according to their sequence length. Each unconnected regular secondary structure is reduced to a single point, the center of its hydrophobic face. These centers are then moved repeatedly in order to obtain a compact hydrophobic core. To prevent secondary structures from interpenetrating, a repulsive term is introduced into the function whose minimization leads to the compact structure. This RUSSIA (Rigid Unconnected Secondary Structure Iterative Assembly) algorithm relies on a small number of variables, so many initial conformations can be tested. Flexibility is introduced as follows: helices and sheets are allowed to rotate around the direction leading to the center of the model, and residues in a sheet can slide along the main direction of the strand in which they are embedded. RUSSIA is fast and simple, and on a test set it produces several good neighboring models with an r.m.s. deviation from the native structures in the range 1.4-3.7 Å. These models can be further evaluated with statistical potentials used in threading approaches in order to detect the best candidate. The limits of the present method are the following: small proteins with few secondary structures are excluded; multidomain proteins must be split into several compact globular domains from their sequences; and sheets of more than five strands and completely buried helices are not treated.
In this first paper the algorithm is developed; in Part II, which follows, some applications are presented and the program is evaluated.
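
The core idea of the abstract, moving the hydrophobic-face centers toward a compact core while a repulsive term keeps the elements apart and loops cap their separation, can be illustrated with a toy sketch. This is not the RUSSIA implementation: the functional form of `energy`, the plain gradient descent in `minimize`, and the parameters `d_min`, `k_rep`, `k_loop`, `steps` and `lr` are all assumptions chosen for illustration. As a further simplification, the loop limits are applied directly to the centers rather than to the Cα extremities, and the rotation and sliding moves are omitted.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical loop record: (element i, element j, maximal allowed distance in Å).
Loop = Tuple[int, int, float]


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(centers: Sequence[Sequence[float]]) -> List[float]:
    n = len(centers)
    return [sum(c[k] for c in centers) / n for k in range(3)]


def energy(centers, loops: Sequence[Loop], d_min=8.0, k_rep=10.0, k_loop=10.0) -> float:
    """Assumed toy function: compactness + repulsion + loop-length penalty."""
    m = centroid(centers)
    e = sum(dist(c, m) ** 2 for c in centers)  # compactness term
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = dist(centers[i], centers[j])
            if d < d_min:  # repulsion prevents interpenetration
                e += k_rep * (d_min - d) ** 2
    for i, j, max_d in loops:
        d = dist(centers[i], centers[j])
        if d > max_d:  # loop cannot stretch beyond its maximal span
            e += k_loop * (d - max_d) ** 2
    return e


def gradient(centers, loops, d_min, k_rep, k_loop) -> List[List[float]]:
    """Analytic gradient of the toy energy with respect to every center."""
    m = centroid(centers)
    grad = [[2.0 * (c[k] - m[k]) for k in range(3)] for c in centers]

    def add_pair(i: int, j: int, coef: float) -> None:
        for k in range(3):
            g = coef * (centers[i][k] - centers[j][k])
            grad[i][k] += g
            grad[j][k] -= g

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d = dist(centers[i], centers[j])
            if 1e-9 < d < d_min:
                add_pair(i, j, -2.0 * k_rep * (d_min - d) / d)
    for i, j, max_d in loops:
        d = dist(centers[i], centers[j])
        if d > max_d:
            add_pair(i, j, 2.0 * k_loop * (d - max_d) / d)
    return grad


def minimize(centers, loops, d_min=8.0, k_rep=10.0, k_loop=10.0,
             steps=5000, lr=0.002) -> List[List[float]]:
    """Plain gradient descent (an assumed stand-in for the paper's minimizer)."""
    pts = [list(c) for c in centers]  # work on a copy
    for _ in range(steps):
        g = gradient(pts, loops, d_min, k_rep, k_loop)
        for i in range(len(pts)):
            for k in range(3):
                pts[i][k] -= lr * g[i][k]
    return pts
```

Because each secondary structure contributes only three coordinates here, one run is cheap, which is in the spirit of the abstract's remark that a small number of variables allows many initial conformations to be tested. A caller could, for example, invoke `minimize` on many random starting sets of centers and keep the lowest-`energy` results for later rescoring.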