Statistical analysis of proteins families: a network and random matrix approach
Rakhi Kumari, Pradeep Bhadola, Nivedita Deo
The European Physical Journal B 97(10), published 2024-10-07
DOI: 10.1140/epjb/s10051-024-00781-6
https://link.springer.com/article/10.1140/epjb/s10051-024-00781-6
Abstract
We present a novel method for analyzing the structural organization of protein families by integrating random matrix theory (RMT) and network theory with the physicochemical properties of amino acids and multiple sequence alignment. RMT separates significant interactions between amino acids from background noise, pinpointing coevolving positions that are likely crucial for protein structure and function. Because it works with amino acid properties rather than treating residues as mere characters, as previous methods do, this approach captures both short-range and long-range correlations. The components of eigenvectors whose eigenvalues lie outside the RMT bound deviate from typical RMT behavior and carry critical information about the system. We quantify the information content of each eigenvector using an entropic estimate, showing that the eigenvectors associated with the smallest eigenvalues are highly localized and informative. These eigenvectors form clusters of biologically and structurally significant positions, which are validated by experiments. By constructing a network of amino acid interactions for each property, we uncover key motifs and interactions. This method enhances our understanding of protein evolution, protein interactions, and potential targets for modulating enzymatic action. We study two protein families, Cadherin-4 and Beta-lactamase, which display two extreme characteristics: one is nearly random and the other is highly structured.
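The RMT filtering and entropic estimate described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the choices here are assumptions: the noise bound is taken to be the Marchenko-Pastur range for a correlation matrix, the entropy is the Shannon entropy of the squared eigenvector components, and the input is a synthetic numeric matrix standing in for a property-encoded alignment (rows are sequences, columns are positions). The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def marchenko_pastur_bounds(n_positions, n_sequences):
    """Eigenvalue range expected for a correlation matrix of pure noise
    (assumed here to be the relevant RMT bound)."""
    q = n_positions / n_sequences
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2


def eigenvector_entropy(vec):
    """Shannon entropy of the squared, normalized eigenvector components.
    Low entropy means the eigenvector is localized on few positions."""
    p = np.asarray(vec, dtype=float) ** 2
    p = p / p.sum()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-(p * np.log(p)).sum())


def synthetic_property_matrix(n_sequences=2000, n_positions=50,
                              n_coupled=5, seed=0):
    """Synthetic stand-in for a property-encoded alignment: independent
    noise everywhere, plus a shared signal on the first n_coupled columns."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_sequences, n_positions))
    shared = rng.standard_normal((n_sequences, 1))
    x[:, :n_coupled] = shared + 0.3 * x[:, :n_coupled]
    return x


x = synthetic_property_matrix()
n_sequences, n_positions = x.shape

# Position-by-position correlation matrix and its spectrum (ascending order).
corr = np.corrcoef(x, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)

# Eigenvalues outside the noise range are candidates for real signal.
lam_min, lam_max = marchenko_pastur_bounds(n_positions, n_sequences)
outside = (eigvals < lam_min) | (eigvals > lam_max)

entropies = np.array(
    [eigenvector_entropy(eigvecs[:, k]) for k in range(n_positions)]
)

print("noise range:", lam_min, lam_max)
print("eigenvalues outside the range:", int(outside.sum()))
print("entropy of smallest-eigenvalue eigenvector:", entropies[0])
print("maximum possible entropy:", np.log(n_positions))
```

In this toy setup the coupled columns push eigenvalues out of the noise range at both ends of the spectrum, and the corresponding eigenvectors concentrate on those columns. With a real alignment, each column would instead hold the chosen physicochemical property value of the residue at that position.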