Pub Date: 2023-12-19, DOI: 10.1007/s10930-023-10168-8
Sini S. Raj, S. S. Vinod Chandra
Protein–protein interactions are crucial for the entry of viruses into the cell. Understanding the mechanism of these interactions is essential for studying human–virus association, viral infections, and antiviral responses, and for developing new biologics and drug candidates. Experimental methods to analyze human–virus protein–protein interactions based on protein sequence data are time-consuming and labor-intensive, so machine learning models are being developed to predict interactions and determine large-scale interactomes between species. The present work highlights the importance of sequence features in classifying interacting and non-interacting proteins from protein sequence data. High-dimensional amino acid sequence features such as Amino Acid Composition (AAC), Dipeptide Composition (DPC), Grouped Amino Acid Composition (GAAC), and Pseudo-Amino Acid Composition (PAAC), among others, are extracted. Following feature extraction, three datasets were created: Dataset 1 contains all of the extracted features, while Datasets 2 and 3 contain the most relevant features obtained through dimensionality reduction. To analyze the importance of high-dimensional features and their participation in protein–protein interactions, a random forest classifier is trained on each of the three datasets. Without dimensionality reduction (Dataset 1), the model exhibited exceptional accuracy, indicating that dimensionality reduction fails to capture the complexity of interactions and the underlying relationships between human and viral proteins. By retaining high-dimensional features, it is possible to capture all the characteristics of protein–protein interactions that resemble host–pathogen associations, leading to the development of biologically meaningful models. Our proposed approach is a more realistic and comprehensive classification model, leading to deeper insights and better applications in virology and drug development.
Title: Significance of Sequence Features in Classification of Protein–Protein Interactions Using Machine Learning. Journal: The Protein Journal, volume 43, issue 1, pages 72–83.
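Two of the sequence features named in the abstract, AAC and DPC, can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`aac`, `dpc`, `pair_features`) are hypothetical, and concatenating the human and viral feature vectors into one row per protein pair is an assumed encoding that the abstract does not specify.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino Acid Composition: fraction of each standard residue (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide Composition: fraction of each ordered residue pair (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    total = len(seq) - 1  # number of overlapping dipeptides
    counts = {}
    for i in range(total):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    return [counts.get(a + b, 0) / total
            for a, b in product(AMINO_ACIDS, repeat=2)]


def pair_features(human_seq, virus_seq):
    """Hypothetical pair encoding: concatenate AAC and DPC of both proteins."""
    return aac(human_seq) + dpc(human_seq) + aac(virus_seq) + dpc(virus_seq)
```

Each protein pair yields 2 × (20 + 400) = 840 values under this assumed encoding. Rows of this kind, labeled as interacting or non-interacting, are the sort of input a random forest classifier could be trained on, either in full or after dimensionality reduction.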
Pub Date: 2023-12-08, DOI: 10.1007/s10930-023-10167-9
Alok Kumar Pandey, Vishal Trivedi
Hemin, a byproduct of hemoglobin degradation, inflicts oxidative insult on cells. Following its accumulation, several proteins are recruited for heme detoxification, with heme oxygenase playing the key role. Chaperones play a protective role primarily by preventing protein degradation and unfolding. They are also known to have miscellaneous secondary roles during similar situations. To discover a secondary role of chaperones during heme stress, we studied the role of the chaperone HSPA8 in the detoxification of hemin. In-silico studies indicated that HSPA8 has a well-defined biophoric environment to bind hemin. Through optical difference spectroscopy, we found that HSPA8 binds hemin through its N-terminal domain with a Kd value of 5.9 ± 0.04 µM and transforms into a hemoprotein. The hemoprotein was tested for peroxidase activity using guaiacol as substrate. The complex formed reacts with H2O2 and exhibits classical peroxidase activity with an ability to oxidize aromatic and halide substrates. HSPA8 catalyzes heme polymerization in a dose-dependent manner through its N-terminal domain. The IR results reveal that the polymer formed exhibits structural similarities to β-hematin, suggesting its covalent nature. The polymerization mechanism was tested through optical spectroscopy, spin-trap, and activity inhibition experiments. The results suggest that the polymerization occurs through a peroxidase-H2O2 system involving a one-electron transfer mechanism, free radical formation, and radical–radical interaction. This highlights a possible cytoprotective role of the HSPA8-hemin complex during pathological conditions such as malaria and sickle cell disease.
Title: Role Transformation of HSPA8 to Heme-peroxidase After Binding Hemin to Catalyze Heme Polymerization. Journal: The Protein Journal, volume 43, issue 1, pages 48–61.
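To give the reported Kd of 5.9 µM some intuition, the sketch below computes the fraction of HSPA8 that would carry hemin at a given free hemin concentration. It assumes a simple one-site equilibrium binding model in which free hemin approximates total hemin. The abstract does not state the binding model, so this is an illustration only, and the function name `fraction_bound` is hypothetical.

```python
def fraction_bound(hemin_uM, kd_uM=5.9):
    """Fractional saturation under an assumed one-site model: [L] / (Kd + [L]).

    hemin_uM: free hemin concentration in micromolar (assumed known).
    kd_uM:    dissociation constant in micromolar (5.9 as reported in the abstract).
    """
    if hemin_uM < 0 or kd_uM <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return hemin_uM / (kd_uM + hemin_uM)
```

Under this assumption, half of the protein is bound when free hemin equals Kd, and saturation approaches but never reaches 1 as hemin increases.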
Pub Date: 2023-12-08, DOI: 10.1007/s10930-023-10162-0
Nicholas E. Burgis, Kandise VanWormer, Devin Robbins, Jonathan Smith
Recent clinical data have identified infant patients with lethal ITPA deficiencies. ITPA is known to modulate ITP concentrations in cells and has a critical function in neural development that is not understood. Polymorphism of the ITPA gene affects outcomes for both ribavirin- and thiopurine-based therapies, and nearly one-third of the human population is thought to harbor an ITPA polymorphism. In a previous site-directed mutagenesis alanine screen of the ITPA substrate selectivity pocket, we identified the ITPA mutant E22A as a gain-of-function mutant with enhanced ITP hydrolysis activity. Here we report a rational enzyme engineering experiment to investigate the biochemical properties of position 22 ITPA mutants and find that E22D ITPA has two- and four-fold improved substrate selectivity for ITP over the canonical purine triphosphates ATP and GTP, respectively, while maintaining biological activity. The novel E22D ITPA should be considered as a platform for further development of ITPA therapies.
Title: An ITPA Enzyme with Improved Substrate Selectivity. Journal: The Protein Journal, volume 43, issue 1, pages 62–71. Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10930-023-10162-0.pdf
Pub Date: 2023-11-28, DOI: 10.21203/rs.3.rs-3121889/v1
M. Ghahramani, Mohammad Bagher Shahsavani, S. H. Khaleghinejad, Ali Niazi, A. Moosavi-Movahedi, Reza Yousefi
Angiotensin-converting enzyme 2 (ACE2) has a specific interaction with the coronavirus spike protein, enabling its entry into human cells. This membrane enzyme converts angiotensin II into angiotensin 1-7, which has an essential role in protecting the heart and improving lung function. Many therapeutic properties have been attributed to human recombinant ACE2 (hrACE2), especially in combating complications related to diabetes mellitus and hypertension, as well as in preventing the coronavirus from entering the target tissues. In the current study, we designed an appropriate gene construct for a hybrid protein containing the ACE2 catalytic subunit and the B subunit of cholera toxin (CTB-ACE2). This structural feature will probably help the recombinant hybrid protein enter mucosal tissues, including lung tissue. Optimization of the hybrid protein's expression was investigated in BL21 bacterial host cells. The hybrid protein was also identified with an appropriate antibody using the ELISA method. A large amount of the hybrid protein (molecular weight of ~100 kDa) was expressed as inclusion bodies when the induction was performed in the presence of 0.25 mM IPTG and 1% sucrose for 10 h. Finally, the protein's structural features were assessed using several biophysical methods. The fluorescence emission intensity and oligomeric size distribution of CTB-ACE2 suggested a temperature-dependent alteration. The β-sheet and α-helix were also dominant in the hybrid protein structure, and the protein displays acceptable chemical stability. Overall, according to our results, the efficient expression and successful purification of the CTB-ACE2 protein may pave the way for its therapeutic application against diseases such as COVID-19, diabetes mellitus, and hypertension.
Title: Efficient Expression in the Prokaryotic Host System, Purification and Structural Analyses of the Recombinant Human ACE2 Catalytic Subunit as a Hybrid Protein with the B Subunit of Cholera Toxin (CTB-ACE2). Journal: The Protein Journal.
Therapeutic proteins are potent, fast-acting drugs that are highly effective in treating various conditions. The use of medicinal proteins has increased in the past 10 years, and it will evolve further as we better understand the molecular pathways of disease. However, their use is associated with high processing costs, limited stability, difficulty of oral administration, and the inability of large proteins to penetrate tissue and reach their target locations. Many methods have been developed to overcome the problems with the stability and chaperone activity of therapeutic proteins, namely the addition of external agents (changing the properties of the surrounding solvent by using stabilizing excipients, e.g., amino acids, sugars, polyols) and internal agents (chemical modifications that influence the protein's structural properties, e.g., mutations, glycosylation). However, these methods have not completely resolved protein instability and chaperone issues. There is still much work to be done on fine-tuning chaperone proteins to increase their biological efficacy and stability. Methylglyoxal (MGO), a potent dicarbonyl compound, reacts with proteins and forms covalent cross-links. Much research on MGO scavengers has been conducted since they are known to alter protein structure, which may result in alterations in biological activity and stability. MGO is naturally produced within the body; however, its impact on chaperones and protein stability needs to be better understood and seems to vary with concentration. This review highlights the efforts of several research groups on the effect of MGO on various proteins. It also addresses the impact of MGO on a client protein, α-crystallin, to understand potential solutions to the protein's chaperone and stability problems.