
Journal of Molecular Biology: Latest Articles

LogoMotif: A Comprehensive Database of Transcription Factor Binding Site Profiles in Actinobacteria
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168558

Actinobacteria undergo a complex multicellular life cycle and produce a wide range of specialized metabolites, including the majority of the antibiotics. These biological processes are controlled by intricate regulatory pathways, and to better understand how they are controlled we need to augment our insights into the transcription factor binding sites. Here, we present LogoMotif (https://logomotif.bioinformatics.nl), an open-source database for characterized and predicted transcription factor binding sites in Actinobacteria, along with their cognate position weight matrices and hidden Markov models. Genome-wide predictions of binding site locations in Streptomyces model organisms are supplied and visualized in interactive regulatory networks. In the web interface, users can freely access, download and investigate the underlying data. With this curated collection of actinobacterial regulatory interactions, LogoMotif serves as a basis for binding site predictions, thus providing users with clues on how to elicit the expression of genes of interest and guide genome mining efforts.
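
To make the idea of a position weight matrix concrete, the following is a minimal sketch of how a log-odds matrix can be built from aligned binding sites and slid along a sequence. It is not LogoMotif's code: the example sites, the uniform background of 0.25, the pseudocount of 1, and the function names build_pwm, score and scan are all illustrative assumptions.

```python
import math

# Hypothetical aligned binding sites (illustrative only, not taken from LogoMotif).
SITES = ["TGACA", "TGACT", "TGTCA", "TGACA"]
ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix: one dict of base scores per column."""
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def score(pwm, window):
    """Sum the per-column log-odds scores for a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence):
    """Score every window of the sequence; returns (start_index, score) pairs."""
    width = len(pwm)
    return [
        (start, score(pwm, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


pwm = build_pwm(SITES)
hits = scan(pwm, "CCTGACAGG")
best_pos, best_score = max(hits, key=lambda hit: hit[1])
```

In this toy case the best-scoring window starts at index 2, where the sequence matches the consensus of the example sites. A genome-wide prediction would additionally need a score threshold and both strands, which this sketch leaves out.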

Citations: 0
AmyloComp: A Bioinformatic Tool for Prediction of Amyloid Co-aggregation
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168437

Typically, amyloid fibrils consist of multiple copies of the same protein. In these fibrils, each polypeptide chain adopts the same β-arc-containing conformation and the chains are stacked in a parallel, in-register manner. In the last few years, however, a considerable body of data has accumulated on the co-aggregation of different amyloid-forming proteins. Known examples of co-aggregation include heteroaggregates of different yeast prions and human proteins Rip1 and Rip3. Since co-aggregation is linked to such important phenomena as the infectivity of amyloids and the molecular mechanisms of functional amyloids, we analyzed its structural aspects in more detail. Axial stacking of different proteins within the same amyloid fibril is one of the most common types of co-aggregation. Using an approach based on the structural similarity of the growing tips of amyloids, we developed a computational method to predict amyloidogenic β-arch structures that are able to interact with each other by axial stacking. Furthermore, we compiled a dataset of 26 experimentally characterized pairs of proteins that are either capable or incapable of co-aggregating, and used this dataset to test and refine our algorithm. The method opens the way to a number of applications, including the identification of microbial proteins capable of triggering amyloidosis in humans. AmyloComp is available at https://bioinfo.crbm.cnrs.fr/index.php?route=tools&tool=30.

Citations: 0
CAPRI-Q: The CAPRI resource evaluating the quality of predicted structures of protein complexes
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168540

Protein interactions are essential for cellular processes. In recent years there has been significant progress in computational prediction of 3D structures of individual protein chains, with the best-performing algorithms reaching sub-Ångström accuracy. These techniques are now finding their way into the prediction of protein interactions, adding to the existing modeling approaches. The community-wide Critical Assessment of Predicted Interactions (CAPRI) has been a catalyst for the development of procedures for the structural modeling of protein assemblies by organizing blind prediction experiments. The predicted structures are assessed against unpublished experimentally determined structures using a set of metrics with proven robustness that have been established in the CAPRI community. In addition, several advanced benchmarking databases provide targets against which users can test docking and assembly modeling software. These include the Protein-Protein Docking Benchmark, the CAPRI Scoreset, and the Dockground database, all developed by members of the CAPRI community. Here we present CAPRI-Q, a stand-alone model quality assessment tool, which can be freely downloaded or used via a publicly available web server. This tool applies the CAPRI metrics to assess the quality of query structures against given target structures, along with other popular quality metrics such as DockQ, TM-score and l-DDT, and classifies the models according to the CAPRI model quality criteria. The tool can handle a variety of protein complex types including those involving peptides, nucleic acids, and oligosaccharides. The source code is freely available from https://gitlab.in2p3.fr/cmsb-public/CAPRI-Q and its web interface through the Dockground resource at https://dockground.compbio.ku.edu/assessment/.
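
As an illustration of one of the metrics named above, the sketch below combines three interface measures into a DockQ score using the published DockQ formula (scaling constants 1.5 Å for the interface RMSD and 8.5 Å for the ligand RMSD) and maps the result onto the quality bands commonly quoted for DockQ. It is not CAPRI-Q's implementation: the function names are hypothetical, the input values are assumed to have been computed elsewhere, and the DockQ bands are an approximation of, not a substitute for, the CAPRI model quality criteria that CAPRI-Q applies.

```python
def scaled_rmsd(rmsd, d):
    """Map an RMSD in angstroms onto (0, 1]; d is the scaling constant."""
    return 1.0 / (1.0 + (rmsd / d) ** 2)


def dockq(fnat, irmsd, lrmsd):
    """DockQ: mean of Fnat, scaled interface RMSD, and scaled ligand RMSD."""
    return (fnat + scaled_rmsd(irmsd, 1.5) + scaled_rmsd(lrmsd, 8.5)) / 3.0


def dockq_band(score):
    """Quality band commonly associated with a DockQ score."""
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


# Hypothetical model: no native contacts recovered, iRMSD 1.5 A, LRMSD 8.5 A.
example_score = dockq(fnat=0.0, irmsd=1.5, lrmsd=8.5)
example_band = dockq_band(example_score)
```

With these inputs each scaled RMSD term equals 0.5, so the score is one third and the model falls in the "acceptable" band.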

Citations: 0
flDPnn2: Accurate and Fast Predictor of Intrinsic Disorder in Proteins
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168605

Prediction of intrinsic disorder in protein sequences is an active research area, with well over 100 predictors released to date. These efforts are motivated by the functional importance and high abundance of intrinsic disorder, combined with relatively few experimental annotations. The disorder predictors are periodically evaluated by independent assessors in the Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiments. The recently completed CAID2 experiment assessed close to 40 state-of-the-art methods, demonstrating that some of them produce accurate results. In particular, the flDPnn2 method, the successor of flDPnn, which performed well in the CAID1 experiment, secured the overall most accurate results on the Disorder-NOX dataset in CAID2. flDPnn2 implements a number of improvements over its predecessor, including changes to the inputs, a larger deep network model that we retrained on a larger training set, and the addition of an alignment module. Using results from CAID2, we show that flDPnn2 produces accurate predictions very quickly, modestly improving on the accuracy of flDPnn and halving the runtime to about 27 s per protein. flDPnn2 is freely available as a convenient web server at http://biomine.cs.vcu.edu/servers/flDPnn2/.

Citations: 0
scRNA-Explorer: An End-user Online Tool for Single Cell RNA-seq Data Analysis Featuring Gene Correlation and Data Filtering
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168654

In the majority of downstream analysis pipelines for single-cell RNA sequencing (scRNA-seq), techniques like dimensionality reduction and feature selection are employed to address the high-dimensional nature of the data. These approaches involve mapping the data onto a lower-dimensional space, eliminating less informative genes, and pinpointing the most pertinent features. This process ultimately reduces the number of dimensions used for downstream analysis, which in turn speeds up the computation of large-scale scRNA-seq data. Most approaches aim to isolate, from the biological background, the genes characterizing different cells and/or the condition under study by establishing lists of differentially expressed or coexpressed genes. Herein, we present scRNA-Explorer, an open-source online tool for simplified and rapid scRNA-seq analysis designed with the end user in mind. scRNA-Explorer offers: (i) interactive filtering of uninformative cells via a web interface, (ii) gene correlation analysis coupled with an extra step that evaluates the biological importance of these correlations, and (iii) gene enrichment analysis of correlated genes to find the specific functions in which a gene is implicated. We developed a pipeline to address the above problem. The scRNA-Explorer pipeline allows users to interrogate scRNA-seq data sets interactively and to explore, via gene expression correlations, the possible function(s) of a gene of interest. scRNA-Explorer can be accessed at https://bioinformatics.med.uoc.gr/shinyapps/app/scrnaexplorer.
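
The first two steps of the workflow described above (cell filtering, then correlation against a gene of interest) can be sketched in a few lines. This is not scRNA-Explorer's code: the toy expression values, the gene names, the min_total filtering rule, and the choice of Pearson correlation are all assumptions made for illustration, since the abstract does not state which filters or correlation measure the tool uses.

```python
import math

# Hypothetical toy matrix: gene -> expression value per cell (illustrative only).
EXPRESSION = {
    "GENE_A": [5, 0, 3, 8, 0, 6],
    "GENE_B": [4, 0, 2, 9, 0, 5],
    "GENE_C": [1, 0, 6, 0, 0, 2],
}


def filter_cells(expression, min_total=1):
    """Drop cells whose summed expression across all genes is below min_total."""
    n_cells = len(next(iter(expression.values())))
    keep = [
        i for i in range(n_cells)
        if sum(values[i] for values in expression.values()) >= min_total
    ]
    return {gene: [values[i] for i in keep] for gene, values in expression.items()}


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def rank_partners(expression, gene):
    """Rank all other genes by their correlation with the gene of interest."""
    target = expression[gene]
    scores = {
        other: pearson(target, values)
        for other, values in expression.items()
        if other != gene
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


filtered = filter_cells(EXPRESSION)
ranking = rank_partners(filtered, "GENE_A")
```

Here the two empty cells are removed before correlation, and GENE_B ranks above GENE_C as the closest partner of GENE_A. The top of such a ranking is the kind of gene list that an enrichment step would then take as input.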

Citations: 0
xiVIEW: Visualisation of Crosslinking Mass Spectrometry Data
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168656

Crosslinking mass spectrometry (MS) has emerged as an important technique for elucidating the in-solution structures of protein complexes and the topology of protein–protein interaction networks. However, the expanding user community lacked an integrated visualisation tool to help them use crosslinking data for investigating biological mechanisms. We addressed this need by developing xiVIEW, a web-based application designed to streamline crosslinking MS data analysis, which we present here. xiVIEW provides a user-friendly interface for accessing coordinated views of mass spectrometric data, network visualisation, annotations extracted from trusted repositories like UniProtKB, and available 3D structures. In accordance with recent recommendations from the crosslinking MS community, xiVIEW (i) provides a standards-compliant parser to improve data integration and (ii) offers accessible visualisation tools. By promoting the adoption of standard file formats and providing a comprehensive visualisation platform, xiVIEW empowers experimentalists and modellers alike to pursue their respective research interests. We anticipate that xiVIEW will advance crosslinking MS-inspired research and facilitate broader and more effective investigations into complex biological systems.

Citations: 0
Modification of Regulatory Tyrosine Residues Biases Human Hsp90α in its Interactions with Cochaperones and Clients
IF 4.7; Zone 2 (Biology); Q1, Biochemistry & Molecular Biology; Pub Date: 2024-09-01; DOI: 10.1016/j.jmb.2024.168772
Yuantao Huo, Rishabh Karnawat, Lixia Liu, Robert A. Knieß, Maike Groß, Xuemei Chen, Matthias P. Mayer

The highly conserved Hsp90 chaperones control the stability and activity of many essential signaling and regulatory proteins, including many protein kinases, E3 ligases and transcription factors. Thereby, Hsp90s couple cellular homeostasis of the proteome to cell fate decisions. High-throughput mass spectrometry revealed 178 and 169 posttranslational modifications (PTMs) for human cytosolic Hsp90α and Hsp90β, respectively, but the physiological consequences have been investigated in detail for only a few of them. In this study, we explored the suitability of the yeast model system for the identification of key regulatory residues in human Hsp90α. Replacement of three tyrosine residues known to be phosphorylated, with phosphomimetic glutamate or non-phosphorylatable phenylalanine, individually and in combination, influenced yeast growth and the maturation of seven different Hsp90 clients in distinct ways. Furthermore, wild-type and mutant Hsp90 differed in their ability to stabilize known clients when expressed in HepG2 HSP90AA1−/− cells. The purified mutant proteins differed in their interaction with the cochaperones Aha1, Cdc37, Hop and p23 and in their support of the maturation of the glucocorticoid receptor ligand binding domain in vitro. The in vivo and in vitro data correspond well to each other, confirming that the yeast system is suitable for the identification of key regulatory sites in human Hsp90s. Our findings indicate that even closely related clients are affected differently by the amino acid replacements at the investigated positions, suggesting that PTMs could bias Hsp90 client specificity.

Citations: 0
An Atypical Mechanism of SUMOylation of Neurofibromin SecPH Domain Provides New Insights into SUMOylation Site Selection
IF 4.7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-08-30 DOI: 10.1016/j.jmb.2024.168768
Mohammed Bergoug, Christine Mosrin, Amandine Serrano, Fabienne Godin, Michel Doudeau, Iva Dundović, Stephane Goffinont, Thierry Normand, Marcin J. Suskiewicz, Béatrice Vallée, Hélène Bénédetti

Neurofibromin (Nf1) is a giant multidomain protein encoded by the tumour-suppressor gene NF1. NF1 is mutated in a common genetic disease, neurofibromatosis type I (NF1), and in various cancers. The protein has a Ras-GAP (GTPase activating protein) activity but is also connected to diverse signalling pathways through its SecPH domain, which interacts with lipids and different protein partners. We previously showed that Nf1 partially colocalized with the ProMyelocytic Leukemia (PML) protein in PML nuclear bodies, hotspots of SUMOylation, thereby suggesting the potential SUMOylation of Nf1. Here, we demonstrate that the full-length isoform 2 and a SecPH fragment of Nf1 are substrates of the SUMO pathway and identify a well-defined SUMOylation profile of SecPH with two main modified lysines. One of these sites, K1731, is highly conserved and surface-exposed. Despite the presence of an inverted SUMO consensus motif surrounding K1731, and a potential SUMO-interacting motif (SIM) within SecPH, we show that neither of these elements is necessary for K1731 SUMOylation, which is also independent of Ubc9 SUMOylation on K14. A 3D model of an interaction between SecPH and Ubc9 centred on K1731, combined with site-directed mutagenesis, identifies specific structural elements of SecPH required for K1731 SUMOylation, some of which are affected in reported NF1 pathogenic variants. This work provides a new example of SUMOylation dependent on the tertiary rather than primary protein structure surrounding the modified site, expanding our knowledge of mechanisms governing SUMOylation site selection.

Citations: 0
Cotranscriptional Folding of a 5′ Stem-loop in the Escherichia coli tbpA Riboswitch at Single-nucleotide Resolution
IF 4.7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-08-30 DOI: 10.1016/j.jmb.2024.168771
Elsa D.M. Hien, Patrick St-Pierre, J. Carlos Penedo, Daniel A. Lafontaine

Transcription elongation is one of the most important processes in the cell. During RNA polymerase elongation, the folding of nascent transcripts plays crucial roles in the genetic decision. Bacterial riboswitches are prime examples of RNA regulators that control gene expression by altering their structure upon metabolite sensing. It was previously revealed that the thiamin pyrophosphate-sensing tbpA riboswitch in Escherichia coli cotranscriptionally adopts three main structures leading to metabolite sensing. Here, using single-molecule FRET, we characterize the transition in which the first nascent structure, a 5′ stem-loop, is unfolded during transcription elongation to form the ligand-binding competent structure. Our results suggest that the structural transition occurs in a relatively abrupt manner, i.e., within a 1–2 nucleotide window. Furthermore, a highly dynamic structural exchange is observed, indicating that riboswitch transcripts perform rapid sampling of nascent co-occurring structures. We also observe that the presence of the RNAP stabilizes the 5′ stem-loop along the elongation process, consistent with RNAP interacting with the 5′ stem-loop. Our study emphasizes the role of early folding stem-loop structures in the cotranscriptional formation of complex RNA molecules involved in genetic regulation.

Citations: 0
Deciphering the Language of Protein-DNA Interactions: A Deep Learning Approach Combining Contextual Embeddings and Multi-Scale Sequence Modeling
IF 4.7 Zone 2 Biology Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2024-08-29 DOI: 10.1016/j.jmb.2024.168769
Yu-Chen Liu, Yi-Jing Lin, Yan-Yun Chang, Cheng-Che Chuang, Yu-Yen Ou

Deciphering the mechanisms governing protein-DNA interactions is crucial for understanding key cellular processes and disease pathways. In this work, we present a powerful deep learning approach that significantly advances the computational prediction of DNA-interacting residues from protein sequences.

Our method leverages the rich contextual representations learned by pre-trained protein language models, such as ProtTrans, to capture intrinsic biochemical properties and sequence motifs indicative of DNA binding sites. We then integrate these contextual embeddings with a multi-window convolutional neural network architecture, which scans across the sequence at varying window sizes to effectively identify both local and global binding patterns.
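
The multi-window idea can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the function names (`conv1d_same`, `multi_window_features`), the window sizes 3, 5 and 7, the random filters, and the toy 4-dimensional vectors standing in for language-model embeddings are all assumptions made for illustration. It only shows how filters of different widths, slid over per-residue embeddings, give each residue one feature per window size.

```python
import random

def conv1d_same(embeddings, kernel, bias=0.0):
    """Slide one filter of odd width over a sequence of per-residue vectors.

    embeddings: list of L vectors, each of length D.
    kernel: list of W vectors, each of length D, with W odd.
    Returns L ReLU activations; zero padding keeps the output length at L.
    """
    length, width = len(embeddings), len(kernel)
    half = width // 2
    out = []
    for i in range(length):
        total = bias
        for k in range(width):
            j = i + k - half
            if 0 <= j < length:  # positions outside the sequence act as zero padding
                total += sum(e * w for e, w in zip(embeddings[j], kernel[k]))
        out.append(max(0.0, total))  # ReLU
    return out

def multi_window_features(embeddings, kernels_by_window):
    """Concatenate, per residue, the activations from filters of several widths."""
    per_window = [conv1d_same(embeddings, kernel) for kernel in kernels_by_window.values()]
    return [list(values) for values in zip(*per_window)]

# Toy stand-in for language-model embeddings: 6 residues, 4 dimensions each (hypothetical data).
rng = random.Random(0)
toy_embeddings = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
# One random filter per window size; the sizes 3, 5 and 7 are illustrative assumptions.
toy_kernels = {w: [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(w)] for w in (3, 5, 7)}
features = multi_window_features(toy_embeddings, toy_kernels)
```

In a trained model the filters would be learned and there would be many per window size; a narrow window responds to a residue's immediate neighbours, while a wider one mixes in more distant context.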

Comprehensive evaluation on curated benchmark datasets demonstrates the remarkable performance of our approach, achieving an area under the ROC curve (AUC) of 0.89 – a substantial improvement over previous state-of-the-art sequence-based predictors. This showcases the immense potential of pairing advanced representation learning and deep neural network designs for uncovering the complex syntax governing protein-DNA interactions directly from primary sequences.
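
For readers unfamiliar with the metric, the area under the ROC curve can be read as the probability that a randomly chosen binding residue receives a higher score than a randomly chosen non-binding residue. The sketch below computes it that way on made-up labels and scores; the function name `roc_auc` and the toy data are assumptions for illustration and are unrelated to the benchmark results reported above.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical per-residue labels (1 = DNA-binding) and predicted scores.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the number of residues, which is fine for a demonstration; a rank-based computation is the usual choice for large datasets.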

Our work not only provides a robust computational tool for characterizing DNA-binding mechanisms, but also highlights the transformative opportunities at the intersection of language modeling, deep learning, and protein sequence analysis. The publicly available code and data further facilitate broader adoption and continued development of these techniques for accelerating mechanistic insights into vital biological processes and disease pathways.

In addition, the code and data for this work are available at https://github.com/B1607/DIRP.

Citations: 0