Physicochemical Evaluation of Remote Homology in the Twilight Zone
Jamie Dennis Dixson, Rajeev Kumar Azad
Proteins-Structure Function and Bioinformatics, pages 452-464. Published 2025-02-01 (Epub 2024-09-01). Impact factor: 3.2; JCR: Q2 (Biochemistry & Molecular Biology).
DOI: 10.1002/prot.26742 (https://doi.org/10.1002/prot.26742)
Citations: 0
Abstract
A fundamental problem in protein evolutionary biology is determining the degree and nature of evolutionary relatedness among homologous proteins that have diverged to the point where they share less than 30% amino acid identity yet retain similar structures and/or functions. Such proteins are said to lie within the "Twilight Zone" of amino acid identity. Many researchers have relied on experimentally determined structures to classify proteins in the Twilight Zone, but such efforts can be highly time-consuming and prohibitively expensive for large-scale analyses. Motivated by this problem, here we use molecular weight-hydrophobicity physicochemical dynamic time warping (MWHP DTW) to quantify the similarity of simulated and real-world homologous protein domains. MWHP DTW is a physicochemical method that requires only the amino acid sequence to quantify the similarity of related proteins, and it is particularly useful within the Twilight Zone because of its resilience to primary sequence substitution saturation. This is a step forward in determining the relatedness among Twilight Zone proteins and, most notably, allows random similarity to be discriminated from true homology in the 0%-20% identity range. The method was previously presented, on an expedited basis, just after the outbreak of COVID-19 because it was able to functionally cluster ACE2-binding betacoronavirus receptor binding domains (RBDs), a task that has been elusive using standard techniques. Here we show that one reason MWHP DTW is effective for comparisons within the Twilight Zone is that it can uncover hidden homology by exploiting physicochemical conservation, something that protein sequence alignment algorithms are inherently incapable of doing within the Twilight Zone. Further, we present an extended definition of the Twilight Zone that incorporates the dynamic relationship between structural, physicochemical, and sequence-based metrics.
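The abstract does not state how MWHP DTW encodes residues or which property scales it uses, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes each residue is represented by two values, the free amino acid molecular weight (Da) and the Kyte-Doolittle hydropathy index, that each property is z-scored across the 20 standard amino acids, and that two sequences are compared with textbook dynamic time warping using Euclidean local cost. The names `encode` and `dtw_distance` are hypothetical.

```python
import math
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index (assumed scale; the abstract names none).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Molecular weight of each free amino acid in daltons (assumed encoding).
AA_MW = {
    "A": 89.09, "R": 174.20, "N": 132.12, "D": 133.10, "C": 121.16,
    "Q": 146.15, "E": 147.13, "G": 75.07, "H": 155.16, "I": 131.17,
    "L": 131.17, "K": 146.19, "M": 149.21, "F": 165.19, "P": 115.13,
    "S": 105.09, "T": 119.12, "W": 204.23, "Y": 181.19, "V": 117.15,
}


def _zscores(table):
    """Rescale a property table to zero mean and unit variance."""
    mu = mean(table.values())
    sigma = pstdev(table.values())
    return {aa: (value - mu) / sigma for aa, value in table.items()}


_MW = _zscores(AA_MW)
_HP = _zscores(KD_HYDROPATHY)


def encode(seq):
    """Map a protein sequence to a list of (weight, hydropathy) points."""
    return [(_MW[aa], _HP[aa]) for aa in seq.upper()]


def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) dynamic time warping distance between two sequences."""
    a, b = encode(seq_a), encode(seq_b)
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, or both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Under these assumptions, a physicochemically conservative substitution such as isoleucine to leucine costs far less than a non-conservative one such as isoleucine to aspartate, so `dtw_distance("AIA", "ALA")` is smaller than `dtw_distance("AIA", "ADA")` even though both pairs differ at exactly one position. That is the kind of signal an identity-based score cannot see.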
About the journal:
PROTEINS: Structure, Function, and Bioinformatics publishes original reports of significant experimental and analytic research in all areas of protein research: structure, function, computation, genetics, and design. The journal encourages reports that present new experimental or computational approaches for interpreting and understanding data from biophysical chemistry, structural studies of proteins and macromolecular assemblies, alterations of protein structure and function engineered through techniques of molecular biology and genetics, functional analyses under physiologic conditions, as well as the interactions of proteins with receptors, nucleic acids, or other specific ligands or substrates. Research in protein and peptide biochemistry directed toward synthesizing or characterizing molecules that simulate aspects of the activity of proteins, or that act as inhibitors of protein function, is also within the scope of PROTEINS. In addition to full-length reports, short communications (usually not more than 4 printed pages) and prediction reports are welcome. Reviews are typically by invitation; authors are encouraged to submit proposed topics for consideration.