Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168558
Actinobacteria undergo a complex multicellular life cycle and produce a wide range of specialized metabolites, including the majority of the antibiotics. These biological processes are controlled by intricate regulatory pathways, and to better understand how they are controlled we need to augment our insights into the transcription factor binding sites. Here, we present LogoMotif (https://logomotif.bioinformatics.nl), an open-source database for characterized and predicted transcription factor binding sites in Actinobacteria, along with their cognate position weight matrices and hidden Markov models. Genome-wide predictions of binding site locations in Streptomyces model organisms are supplied and visualized in interactive regulatory networks. In the web interface, users can freely access, download and investigate the underlying data. With this curated collection of actinobacterial regulatory interactions, LogoMotif serves as a basis for binding site predictions, thus providing users with clues on how to elicit the expression of genes of interest and guide genome mining efforts.
Title: LogoMotif: A Comprehensive Database of Transcription Factor Binding Site Profiles in Actinobacteria (Journal of Molecular Biology)
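The LogoMotif abstract above refers to position weight matrices (PWMs) as the basis for binding site prediction. As a rough illustration of the general technique, the following minimal sketch builds a log-odds PWM from a handful of aligned sites and scans a sequence for the best-scoring window. The sites, the sequence, the pseudocount, the uniform background and all function names are invented for illustration; they are not taken from LogoMotif, and this is not the database's own scoring pipeline.

```python
import math


def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Build a log-odds PWM (one dict per column) from aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount  # pseudocount added once per base
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_window(pwm, window):
    """Sum the per-column log-odds scores for one window (A/C/G/T only)."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start_index) of the highest-scoring window in the sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned binding sites and a hypothetical sequence to scan.
toy_sites = ["TGTCAC", "TGTCAC", "TGTGAC", "TGACAC"]
toy_pwm = build_pwm(toy_sites)
score, start = best_hit(toy_pwm, "AAATGTCACGGG")
print(start, round(score, 2))  # the consensus window TGTCAC starts at index 3
```

A real genome scan would also need to score the reverse strand, handle ambiguous bases, and choose a score threshold; none of that is attempted here.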
Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168437
Typically, amyloid fibrils consist of multiple copies of the same protein. In these fibrils, each polypeptide chain adopts the same β-arc-containing conformation and these chains are stacked in a parallel and in-register manner. In the last few years, however, a considerable body of data has accumulated on the co-aggregation of different amyloid-forming proteins. Known examples of co-aggregation include heteroaggregates of different yeast prions and of the human proteins Rip1 and Rip3. Since co-aggregation is linked to such important phenomena as the infectivity of amyloids and the molecular mechanisms of functional amyloids, we analyzed its structural aspects in more detail. Axial stacking of different proteins within the same amyloid fibril is one of the most common types of co-aggregation. Using an approach based on the structural similarity of the growing tips of amyloids, we developed a computational method to predict amyloidogenic β-arch structures that can interact with each other by axial stacking. Furthermore, we compiled a dataset of 26 experimentally known pairs of proteins that are either capable or incapable of co-aggregating. We used this dataset to test and refine our algorithm. The developed method opens the way for a number of applications, including the identification of microbial proteins capable of triggering amyloidosis in humans. AmyloComp is available on the website: https://bioinfo.crbm.cnrs.fr/index.php?route=tools&tool=30.
Title: AmyloComp: A Bioinformatic Tool for Prediction of Amyloid Co-aggregation (Journal of Molecular Biology)
Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168540
Protein interactions are essential for cellular processes. In recent years there has been significant progress in computational prediction of 3D structures of individual protein chains, with the best-performing algorithms reaching sub-Ångström accuracy. These techniques are now finding their way into the prediction of protein interactions, adding to the existing modeling approaches. The community-wide Critical Assessment of Predicted Interactions (CAPRI) has been a catalyst for the development of procedures for the structural modeling of protein assemblies by organizing blind prediction experiments. The predicted structures are assessed against unpublished experimentally determined structures using a set of metrics with proven robustness that have been established in the CAPRI community. In addition, several advanced benchmarking databases provide targets against which users can test docking and assembly modeling software. These include the Protein-Protein Docking Benchmark, the CAPRI Scoreset, and the Dockground database, all developed by members of the CAPRI community. Here we present CAPRI-Q, a stand-alone model quality assessment tool, which can be freely downloaded or used via a publicly available web server. This tool applies the CAPRI metrics to assess the quality of query structures against given target structures, along with other popular quality metrics such as DockQ, TM-score and l-DDT, and classifies the models according to the CAPRI model quality criteria. The tool can handle a variety of protein complex types including those involving peptides, nucleic acids, and oligosaccharides. The source code is freely available from https://gitlab.in2p3.fr/cmsb-public/CAPRI-Q and its web interface through the Dockground resource at https://dockground.compbio.ku.edu/assessment/.
Title: CAPRI-Q: The CAPRI resource evaluating the quality of predicted structures of protein complexes (Journal of Molecular Biology)
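The CAPRI-Q abstract above mentions DockQ among the quality metrics and a classification of models into quality classes. As a small illustration of what such a classification looks like, the sketch below bins a DockQ score using the cut-offs commonly quoted for DockQ (0.23, 0.49 and 0.80). This is an assumption-laden toy: the function name is hypothetical, and CAPRI-Q itself classifies models with the CAPRI criteria, which this snippet does not reproduce.

```python
def dockq_quality_class(dockq):
    """Map a DockQ score in [0, 1] to a quality label.

    Uses the commonly quoted DockQ cut-offs (0.23 / 0.49 / 0.80);
    this is an illustrative binning, not CAPRI-Q's implementation.
    """
    if not 0.0 <= dockq <= 1.0:
        raise ValueError("DockQ scores lie between 0 and 1")
    if dockq >= 0.80:
        return "high"
    if dockq >= 0.49:
        return "medium"
    if dockq >= 0.23:
        return "acceptable"
    return "incorrect"


# Hypothetical scores for four models of the same target.
for model_score in (0.85, 0.50, 0.23, 0.10):
    print(model_score, dockq_quality_class(model_score))
```

A single continuous score is convenient for ranking many models, whereas the discrete classes make results comparable across targets.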
Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168605
Prediction of intrinsic disorder in protein sequences is an active research area, with well over 100 predictors released to date. These efforts are motivated by the functional importance and high abundance of intrinsic disorder, combined with the relatively low amount of experimental annotation. Disorder predictors are periodically evaluated by independent assessors in the Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiments. The recently completed CAID2 experiment assessed close to 40 state-of-the-art methods, demonstrating that some of them produce accurate results. In particular, the flDPnn2 method, the successor of flDPnn, which performed well in the CAID1 experiment, secured the most accurate overall results on the Disorder-NOX dataset in CAID2. flDPnn2 implements a number of improvements over its predecessor, including changes to the inputs, a larger deep network model that we retrained on a larger training set, and the addition of an alignment module. Using results from CAID2, we show that flDPnn2 produces accurate predictions very quickly, modestly improving over the accuracy of flDPnn and reducing the runtime by half, to about 27 s per protein. flDPnn2 is freely available as a convenient web server at http://biomine.cs.vcu.edu/servers/flDPnn2/.
Title: flDPnn2: Accurate and Fast Predictor of Intrinsic Disorder in Proteins (Journal of Molecular Biology)
Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168654
In the majority of downstream analysis pipelines for single-cell RNA sequencing (scRNA-seq), techniques like dimensionality reduction and feature selection are employed to address the high-dimensional nature of the data. These approaches involve mapping the data onto a lower-dimensional space, eliminating less informative genes, and pinpointing the most pertinent features. This process ultimately reduces the number of dimensions used for downstream analysis, which in turn speeds up computation on large-scale scRNA-seq data. Most approaches aim to isolate, from the biological background, the genes characterizing different cells and/or the condition under study by establishing lists of differentially expressed or coexpressed genes. Herein, we present scRNA-Explorer, an open-source online tool for simplified and rapid scRNA-seq analysis designed with the end user in mind. scRNA-Explorer provides: (i) interactive filtering of uninformative cells via a web interface, (ii) gene correlation analysis coupled with an extra step that evaluates the biological importance of these correlations, and (iii) gene enrichment analysis of correlated genes to find the implication of genes in specific functions. We developed a pipeline to address the above problem. The scRNA-Explorer pipeline allows users to interactively interrogate scRNA-seq data sets and to explore, via gene expression correlations, the possible function(s) of a gene of interest. scRNA-Explorer can be accessed at https://bioinformatics.med.uoc.gr/shinyapps/app/scrnaexplorer.
Title: scRNA-Explorer: An End-user Online Tool for Single Cell RNA-seq Data Analysis Featuring Gene Correlation and Data Filtering (Journal of Molecular Biology)
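The scRNA-Explorer abstract above describes exploring the possible function of a gene of interest through gene expression correlations. The idea can be sketched, under simplifying assumptions, as ranking all other genes by their Pearson correlation with a target gene across cells. The gene names, expression values, threshold and function names below are hypothetical, and the snippet does not reproduce the tool's own filtering or significance evaluation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)


def correlated_genes(expression, target, min_abs_r=0.5):
    """Rank genes by |r| against the target gene.

    expression: dict mapping gene name -> list of values, one per cell.
    Returns (gene, r) pairs with |r| >= min_abs_r, strongest first.
    """
    hits = []
    for gene, values in expression.items():
        if gene == target:
            continue
        r = pearson(expression[target], values)
        if r is not None and abs(r) >= min_abs_r:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: -abs(hit[1]))


# Hypothetical expression values for four genes across four cells.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [5.0, 4.0, 2.0, 1.0],   # falls as geneA rises
    "geneD": [3.0, 3.0, 3.0, 3.0],   # constant, so it is skipped
}
for gene, r in correlated_genes(toy_expression, "geneA"):
    print(gene, round(r, 3))
```

Real scRNA-seq matrices are sparse and noisy, so a rank-based correlation and a multiple-testing correction would usually be considered before interpreting any hit.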
Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168656
Crosslinking mass spectrometry (MS) has emerged as an important technique for elucidating the in-solution structures of protein complexes and the topology of protein–protein interaction networks. However, the expanding user community lacked an integrated visualisation tool to help them use crosslinking data for investigating biological mechanisms. We addressed this need by developing xiVIEW, a web-based application designed to streamline crosslinking MS data analysis, which we present here. xiVIEW provides a user-friendly interface for accessing coordinated views of mass spectrometric data, network visualisation, annotations extracted from trusted repositories like UniProtKB, and available 3D structures. In accordance with recent recommendations from the crosslinking MS community, xiVIEW (i) provides a standards-compliant parser to improve data integration and (ii) offers accessible visualisation tools. By promoting the adoption of standard file formats and providing a comprehensive visualisation platform, xiVIEW empowers experimentalists and modellers alike to pursue their respective research interests. We anticipate that xiVIEW will advance crosslinking MS-inspired research, and facilitate broader and more effective investigations into complex biological systems.
Title: xiVIEW: Visualisation of Crosslinking Mass Spectrometry Data (Journal of Molecular Biology)
Pub Date: 2024-09-01 DOI: 10.1016/j.jmb.2024.168772
Yuantao Huo, Rishabh Karnawat, Lixia Liu, Robert A. Knieß, Maike Groß, Xuemei Chen, Matthias P. Mayer
The highly conserved Hsp90 chaperones control the stability and activity of many essential signaling and regulatory proteins, including many protein kinases, E3 ligases and transcription factors. Thereby, Hsp90s couple cellular homeostasis of the proteome to cell fate decisions. High-throughput mass spectrometry revealed 178 and 169 posttranslational modifications (PTMs) for human cytosolic Hsp90α and Hsp90β, respectively, but the physiological consequences have been investigated in some detail for only a few of these modifications. In this study, we explored the suitability of the yeast model system for the identification of key regulatory residues in human Hsp90α. Replacement of three tyrosine residues known to be phosphorylated by phosphomimetic glutamate and by non-phosphorylatable phenylalanine, individually and in combination, influenced yeast growth and the maturation of 7 different Hsp90 clients in distinct ways. Furthermore, wild-type and mutant Hsp90 differed in their ability to stabilize known clients when expressed in HepG2 HSP90AA1−/− cells. The purified mutant proteins differed in their interaction with the cochaperones Aha1, Cdc37, Hop and p23 and in their support of the maturation of the glucocorticoid receptor ligand binding domain in vitro. In vivo and in vitro data correspond well to each other, confirming that the yeast system is suitable for the identification of key regulatory sites in human Hsp90s. Our findings indicate that even closely related clients are affected differently by the amino acid replacements in the investigated positions, suggesting that PTMs could bias Hsp90 client specificity.
Title: Modification of Regulatory Tyrosine Residues Biases Human Hsp90α in its Interactions with Cochaperones and Clients (Journal of Molecular Biology)
Pub Date: 2024-08-30 DOI: 10.1016/j.jmb.2024.168768
Mohammed Bergoug, Christine Mosrin, Amandine Serrano, Fabienne Godin, Michel Doudeau, Iva Dundović, Stephane Goffinont, Thierry Normand, Marcin J. Suskiewicz, Béatrice Vallée, Hélène Bénédetti
Neurofibromin (Nf1) is a giant multidomain protein encoded by the tumour-suppressor gene NF1. NF1 is mutated in a common genetic disease, neurofibromatosis type I (NF1), and in various cancers. The protein has a Ras-GAP (GTPase activating protein) activity but is also connected to diverse signalling pathways through its SecPH domain, which interacts with lipids and different protein partners. We previously showed that Nf1 partially colocalized with the ProMyelocytic Leukemia (PML) protein in PML nuclear bodies, hotspots of SUMOylation, thereby suggesting the potential SUMOylation of Nf1. Here, we demonstrate that the full-length isoform 2 and a SecPH fragment of Nf1 are substrates of the SUMO pathway and identify a well-defined SUMOylation profile of SecPH with two main modified lysines. One of these sites, K1731, is highly conserved and surface-exposed. Despite the presence of an inverted SUMO consensus motif surrounding K1731, and a potential SUMO-interacting motif (SIM) within SecPH, we show that neither of these elements is necessary for K1731 SUMOylation, which is also independent of Ubc9 SUMOylation on K14. A 3D model of an interaction between SecPH and Ubc9 centred on K1731, combined with site-directed mutagenesis, identifies specific structural elements of SecPH required for K1731 SUMOylation, some of which are affected in reported NF1 pathogenic variants. This work provides a new example of SUMOylation dependent on the tertiary rather than primary protein structure surrounding the modified site, expanding our knowledge of mechanisms governing SUMOylation site selection.
Title: An Atypical Mechanism of SUMOylation of Neurofibromin SecPH Domain Provides New Insights into SUMOylation Site Selection (Journal of Molecular Biology)
Pub Date: 2024-08-30, DOI: 10.1016/j.jmb.2024.168771
Elsa D.M. Hien, Patrick St-Pierre, J. Carlos Penedo, Daniel A. Lafontaine
Transcription elongation is one of the most important processes in the cell. During RNA polymerase elongation, the folding of nascent transcripts plays crucial roles in the genetic decision. Bacterial riboswitches are prime examples of RNA regulators that control gene expression by altering their structure upon metabolite sensing. It was previously revealed that the thiamin pyrophosphate-sensing tbpA riboswitch in Escherichia coli cotranscriptionally adopts three main structures leading to metabolite sensing. Here, using single-molecule FRET, we characterize the transition in which the first nascent structure, a 5′ stem-loop, is unfolded during transcription elongation to form the ligand-binding competent structure. Our results suggest that the structural transition occurs in a relatively abrupt manner, i.e., within a 1–2 nucleotide window. Furthermore, a highly dynamic structural exchange is observed, indicating that riboswitch transcripts perform rapid sampling of nascent co-occurring structures. We also observe that the presence of the RNAP stabilizes the 5′ stem-loop along the elongation process, consistent with RNAP interacting with the 5′ stem-loop. Our study emphasizes the role of early folding stem-loop structures in the cotranscriptional formation of complex RNA molecules involved in genetic regulation.
{"title":"Cotranscriptional Folding of a 5′ Stem-loop in the Escherichia coli tbpA Riboswitch at Single-nucleotide Resolution","authors":"Elsa D.M. Hien, Patrick St-Pierre, J. Carlos Penedo, Daniel A. Lafontaine","doi":"10.1016/j.jmb.2024.168771","journal":{"name":"Journal of Molecular Biology"},"publicationDate":"2024-08-30","publicationTypes":"Journal Article","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0022283624003917/pdfft?md5=4b592b408e4fe1c2e33ab0984a7938a4&pid=1-s2.0-S0022283624003917-main.pdf"}
Pub Date: 2024-08-29, DOI: 10.1016/j.jmb.2024.168769
Yu-Chen Liu, Yi-Jing Lin, Yan-Yun Chang, Cheng-Che Chuang, Yu-Yen Ou
Deciphering the mechanisms governing protein-DNA interactions is crucial for understanding key cellular processes and disease pathways. In this work, we present a powerful deep learning approach that significantly advances the computational prediction of DNA-interacting residues from protein sequences.
Our method leverages the rich contextual representations learned by pre-trained protein language models, such as ProtTrans, to capture intrinsic biochemical properties and sequence motifs indicative of DNA binding sites. We then integrate these contextual embeddings with a multi-window convolutional neural network architecture, which scans across the sequence at varying window sizes to effectively identify both local and global binding patterns.
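As a rough illustration of the multi-window idea described above, the following Python sketch scores each residue by sliding filters of several window sizes over per-residue embeddings and combining their responses. It is not the authors' implementation (their code is in the repository linked in the abstract). Everything specific here is an assumption made for illustration: random vectors stand in for ProtTrans embeddings, the window sizes 3, 5 and 7 and the single filter per window are arbitrary choices, and all weights are untrained.

```python
import math
import random


def conv1d_same(embeddings, kernel):
    """Slide one filter (window x dim weights) along the sequence with zero padding."""
    window, half = len(kernel), len(kernel) // 2
    responses = []
    for center in range(len(embeddings)):
        total = 0.0
        for k in range(window):
            pos = center + k - half
            if 0 <= pos < len(embeddings):  # positions outside the sequence count as zeros
                total += sum(w * x for w, x in zip(kernel[k], embeddings[pos]))
        responses.append(max(0.0, total))  # ReLU activation
    return responses


def multi_window_features(embeddings, kernels_by_window):
    """Per residue, concatenate the responses of filters with different window sizes."""
    per_window = [conv1d_same(embeddings, kernels_by_window[w])
                  for w in sorted(kernels_by_window)]
    return [list(column) for column in zip(*per_window)]


def predict_binding(embeddings, kernels_by_window, out_weights, out_bias):
    """Map the concatenated features to a per-residue score in (0, 1) via a sigmoid."""
    scores = []
    for row in multi_window_features(embeddings, kernels_by_window):
        logit = sum(w * f for w, f in zip(out_weights, row)) + out_bias
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


# Hypothetical toy input: 12 residues with 8-dimensional embeddings (assumed sizes).
rng = random.Random(0)
seq_len, dim = 12, 8
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
kernels = {w: [[rng.gauss(0, 0.3) for _ in range(dim)] for _ in range(w)]
           for w in (3, 5, 7)}
out_weights = [rng.gauss(0, 1) for _ in kernels]  # untrained, for illustration only
probs = predict_binding(embeddings, kernels, out_weights, 0.0)
print([round(p, 3) for p in probs])
```

Small windows respond to short local motifs, while larger windows see more sequence context around each residue; concatenating them lets the final layer weigh both. A trained model would learn the filters and output weights from labelled binding residues rather than drawing them at random.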
Comprehensive evaluation on curated benchmark datasets demonstrates the remarkable performance of our approach, achieving an area under the ROC curve (AUC) of 0.89 – a substantial improvement over previous state-of-the-art sequence-based predictors. This showcases the immense potential of pairing advanced representation learning and deep neural network designs for uncovering the complex syntax governing protein-DNA interactions directly from primary sequences.
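For readers unfamiliar with the metric, the area under the ROC curve can be read as the probability that a randomly chosen positive (here, DNA-interacting) residue receives a higher score than a randomly chosen negative one. The minimal Python sketch below computes it that way on made-up toy labels and scores; the data are hypothetical and are not taken from the paper's benchmark.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical per-residue labels (1 = binding) and predicted scores.
labels = [1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.8, 0.5, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.89 should be read.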
Our work not only provides a robust computational tool for characterizing DNA-binding mechanisms, but also highlights the transformative opportunities at the intersection of language modeling, deep learning, and protein sequence analysis. The publicly available code and data further facilitate broader adoption and continued development of these techniques for accelerating mechanistic insights into vital biological processes and disease pathways.
In addition, the code and data for this work are available at https://github.com/B1607/DIRP.
{"title":"Deciphering the Language of Protein-DNA Interactions: A Deep Learning Approach Combining Contextual Embeddings and Multi-Scale Sequence Modeling","authors":"Yu-Chen Liu, Yi-Jing Lin, Yan-Yun Chang, Cheng-Che Chuang, Yu-Yen Ou","doi":"10.1016/j.jmb.2024.168769","journal":{"name":"Journal of Molecular Biology"},"publicationDate":"2024-08-29","publicationTypes":"Journal Article","openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0022283624003899/pdfft?md5=48e8a1f78b82ff4e5d3d37956f6b0f26&pid=1-s2.0-S0022283624003899-main.pdf"}