
Briefings in Bioinformatics: latest articles

Predictive modelling of acute Promyelocytic leukaemia resistance to retinoic acid therapy.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbaf002
José A Sánchez-Villanueva, Lia N'Guyen, Mathilde Poplineau, Estelle Duprez, Élisabeth Remy, Denis Thieffry

Acute Promyelocytic Leukaemia (APL) arises from an aberrant chromosomal translocation involving the Retinoic Acid Receptor Alpha (RARA) gene, predominantly with the Promyelocytic Leukaemia (PML) or Promyelocytic Leukaemia Zinc Finger (PLZF) genes. The resulting oncoproteins block the haematopoietic differentiation program, promoting aberrant proliferative promyelocytes. Retinoic Acid (RA) therapy is successful in most PML::RARA patients, while PLZF::RARA patients frequently become resistant and relapse. Recent studies pointed to various underlying molecular components, but their precise contributions remain to be deciphered. We developed a logical network model integrating signalling, transcriptional, and epigenetic regulatory mechanisms, which captures key features of the APL cell responses to RA depending on the genetic background. The explicit inclusion of the histone methyltransferase EZH2 allowed the assessment of its role in the resistance mechanism, distinguishing between its canonical and non-canonical activities. The model dynamics were thoroughly analysed using tools integrated in the public software suite maintained by the CoLoMoTo consortium (https://colomoto.github.io/). The model serves as a solid basis to assess the roles of novel regulatory mechanisms, as well as to explore novel therapeutic approaches in silico.
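
As a rough illustration of what a logical (Boolean) network model computes, the sketch below enumerates the stable states of a toy four-node network under synchronous updating. The node names and rules are hypothetical and are not taken from the published APL model, and none of the CoLoMoTo tools are used here; it only shows the kind of analysis (finding fixed points for each input condition) that the abstract refers to.

```python
from itertools import product

# Hypothetical toy rules, NOT the published APL model.
# RA and Resistance are inputs (they keep their values); a repressor is
# switched off by RA unless the resistance factor is present;
# differentiation requires the repressor to be off.
NODES = ("RA", "Resistance", "Repressor", "Differentiation")


def step(state):
    """Apply all rules at once (synchronous update)."""
    ra, res, rep, _diff = (state[n] for n in NODES)
    return {
        "RA": ra,
        "Resistance": res,
        "Repressor": (not ra) or res,
        "Differentiation": not rep,
    }


def fixed_points():
    """Return every state that the update maps onto itself."""
    found = []
    for values in product([False, True], repeat=len(NODES)):
        state = dict(zip(NODES, values))
        if step(state) == state:
            found.append(state)
    return found
```

In this toy network each combination of the two inputs has exactly one stable state, and differentiation is reached only when RA is present and the resistance factor is absent, which mirrors in miniature the genotype-dependent response described above.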

GPSD: a hybrid learning framework for the prediction of phosphatase-specific dephosphorylation sites.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbae694
Cheng Han, Shanshan Fu, Miaomiao Chen, Yujie Gou, Dan Liu, Chi Zhang, Xinhe Huang, Leming Xiao, Miaoying Zhao, Jiayi Zhang, Qiang Xiao, Di Peng, Yu Xue

Protein phosphorylation is dynamically and reversibly regulated by protein kinases and protein phosphatases, and plays an essential role in orchestrating a wide range of biological processes. Although a number of tools have been developed for predicting kinase-specific phosphorylation sites (p-sites), computational prediction of phosphatase-specific dephosphorylation sites remains a great challenge. In this study, we manually curated 4393 experimentally identified site-specific phosphatase-substrate relationships for 3463 dephosphorylation sites occurring on phosphoserine, phosphothreonine, and/or phosphotyrosine residues, from the literature and public databases. Then, we developed a hybrid learning framework, the group-based prediction system for the prediction of phosphatase-specific dephosphorylation sites (GPSD). For model training, we integrated 10 types of sequence features and utilized three types of machine learning methods, including penalized logistic regression, deep neural networks, and transformer neural networks. First, a pretrained model was constructed using 561 416 nonredundant p-sites and then fine-tuned to generate computational models for predicting general dephosphorylation sites. In addition, 103 individual phosphatase-specific predictors were constructed via transfer learning and meta-learning. For site prediction, one or multiple protein sequences in FASTA format can be submitted, and the prediction results are shown together with additional annotations, such as protein-protein interactions, structural information, and disorder propensity. The online service of GPSD is freely available at https://gpsd.biocuckoo.cn/. We believe that GPSD can serve as a valuable tool for further analysis of dephosphorylation.

A novel framework for phage-host prediction via logical probability theory and network sparsification.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbae708
Ankang Wei, Huanghan Zhan, Zhen Xiao, Weizhong Zhao, Xingpeng Jiang

Bacterial resistance has emerged as one of the greatest threats to human health, and phages have shown tremendous potential in addressing the issue of drug-resistant bacteria by lysing their hosts. The identification of phage-host interactions (PHI) is crucial for addressing bacterial infections. Some existing computational methods for predicting PHI are suboptimal in terms of prediction efficiency due to the limited types of available information. Despite the emergence of some supporting information, the generalizability of models using this information is limited by the small scale of the databases. Additionally, most existing models overlook the sparsity of association data, which severely impacts their predictive performance as well. In this study, we propose a dual-view sparse network model (DSPHI) to predict PHI, which leverages logical probability theory and network sparsification. Specifically, we first construct similarity networks using the sequences of phages and hosts respectively, and then sparsify these networks, enabling the model to focus more on key information during the learning process, thereby improving prediction efficiency. Next, we utilize logical probability theory to compute high-order logical information between phages (hosts), which is known as mutual information. Subsequently, we connect this information in node form to the sparse phage (host) similarity network, resulting in a phage (host) heterogeneous network that better integrates the two information views, thereby reducing the complexity of model computation and enhancing information aggregation capabilities. The hidden features of phages and hosts are explored through graph learning algorithms. Experimental results demonstrate that mutual information is effective for predicting PHI, and that the sparsification procedure of similarity networks significantly improves the model's predictive performance.
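
The abstract does not say which sparsification rule DSPHI applies, so the following is only a sketch of one common choice, top-k neighbour pruning, under that assumption. The function name `sparsify_top_k` and the list-of-lists matrix layout are illustrative and do not come from the paper.

```python
def sparsify_top_k(similarity, k):
    """Keep, for each node, only its k most similar neighbours.

    `similarity` is a symmetric list-of-lists matrix. An edge survives if
    either endpoint ranks the other among its top k (union rule), so the
    result stays symmetric. Self-similarities on the diagonal are dropped.
    """
    n = len(similarity)
    keep = set()
    for i in range(n):
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: similarity[i][j],
            reverse=True,
        )
        for j in neighbours[:k]:
            keep.add((min(i, j), max(i, j)))

    # Rebuild a matrix that contains only the retained edges.
    sparse = [[0.0] * n for _ in range(n)]
    for i, j in keep:
        sparse[i][j] = similarity[i][j]
        sparse[j][i] = similarity[j][i]
    return sparse
```

With a small k, weak similarities are removed and only the strongest links of each phage (or host) remain, which is the general effect the abstract attributes to sparsification.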

ProtGraph: a tool for the quick and comprehensive exploration and exploitation of the peptide search space derived from protein sequence databases using graphs.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbae671
Dominik Lux, Katrin Marcus-Alic, Martin Eisenacher, Julian Uszkoreit

Due to computational resource limitations, in mass spectrometry-based proteomics only a limited set of peptide sequences is used for the matching against measured spectra. We present an approach to represent proteins by graphs and allow not only the canonical sequences but also known isoforms and annotated amino acid variations, e.g. originating from genomic mutations, and further common protein sequence features contained in UniProtKB or other protein databases. Our C++ and Python implementation enables a groundbreaking comprehensive characterization of the peptide search space, encompassing for the first time all available annotations in a protein database (in combination, more than $10^{200}$ possibilities). Additionally, it can be used to quickly extract the relevant subset of the search space for peptide-to-spectrum matching, e.g. filtering by the peptide mass. We demonstrate the advantages and innovative findings of our implementation compared to previous workflows by re-analysing publicly available datasets.
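
The idea of representing a protein and its annotated variants as a graph can be sketched as follows. This is a toy example with a hypothetical three-residue protein and one substitution; the node layout, names, and functions are assumptions for illustration and do not reflect ProtGraph's actual data structures or interface.

```python
from functools import lru_cache

# Hypothetical toy protein "MKA" with one annotated variant K -> R at
# position 2. Node ids map to residues; edges follow sequence order.
RESIDUE = {0: "", 1: "M", 2: "K", 3: "R", 4: "A", 5: ""}  # 0 = start, 5 = end
EDGES = {0: [1], 1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}


@lru_cache(maxsize=None)
def count_paths(node=0, end=5):
    """Count start-to-end paths without enumerating them."""
    if node == end:
        return 1
    return sum(count_paths(nxt, end) for nxt in EDGES[node])


def sequences(node=0, end=5, prefix=""):
    """Yield every sequence spelled by a start-to-end path."""
    prefix += RESIDUE[node]
    if node == end:
        yield prefix
        return
    for nxt in EDGES[node]:
        yield from sequences(nxt, end, prefix)
```

Counting paths by memoised recursion visits each node once, whereas enumerating sequences grows with the number of variant combinations. That difference is why a graph representation can describe a very large search space without listing every sequence in it.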

Correction to: BANMF-S: a blockwise accelerated non-negative matrix factorization framework with structural network constraints for single cell imputation.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbaf034
Deep learning-based design and experimental validation of a medicine-like human antibody library.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbaf023
Nandhini Rajagopal, Udit Choudhary, Kenny Tsang, Kyle P Martin, Murat Karadag, Hsin-Ting Chen, Na-Young Kwon, Joseph Mozdzierz, Alexander M Horspool, Li Li, Peter M Tessier, Michael S Marlow, Andrew E Nixon, Sandeep Kumar

Antibody generation requires the use of one or more time-consuming methods, namely animal immunization and in vitro display technologies. However, the recent availability of large amounts of antibody sequence and structural data in the public domain along with the advent of generative deep learning algorithms raises the possibility of computationally generating novel antibody sequences with desirable developability attributes. Here, we describe a deep learning model for computationally generating libraries of highly human antibody variable regions whose intrinsic physicochemical properties resemble those of the variable regions of the marketed antibody-based biotherapeutics (medicine-likeness). We generated 100000 variable region sequences of antigen-agnostic human antibodies belonging to the IGHV3-IGKV1 germline pair using a training dataset of 31416 human antibodies that satisfied our computational developability criteria. The in-silico generated antibodies recapitulate intrinsic sequence, structural, and physicochemical properties of the training antibodies, and compare favorably with the experimentally measured biophysical attributes of 100 variable regions of marketed and clinical stage antibody-based biotherapeutics. A sample of 51 highly diverse in-silico generated antibodies with >90th percentile medicine-likeness and >90% humanness was evaluated by two independent experimental laboratories. Our data show the in-silico generated sequences exhibit high expression, monomer content, and thermal stability along with low hydrophobicity, self-association, and non-specific binding when produced as full-length monoclonal antibodies. The ability to computationally generate developable human antibody libraries is a first step towards enabling in-silico discovery of antibody-based biotherapeutics. These findings are expected to accelerate in-silico discovery of antibody-based biotherapeutics and expand the druggable antigen space to include targets refractory to conventional antibody discovery methods requiring in vitro antigen production.

Therapeutic gene target prediction using novel deep hypergraph representation learning.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbaf019
Kibeom Kim, Juseong Kim, Minwook Kim, Hyewon Lee, Giltae Song

Identifying therapeutic genes is crucial for developing treatments targeting genetic causes of diseases, but experimental trials are costly and time-consuming. Although many deep learning approaches aim to identify biomarker genes, predicting therapeutic target genes remains challenging due to the limited number of known targets. To address this, we propose HIT (Hypergraph Interaction Transformer), a deep hypergraph representation learning model that identifies a gene's therapeutic potential, biomarker status, or lack of association with diseases. HIT uses hypergraph structures of genes, ontologies, diseases, and phenotypes, employing attention-based learning to capture complex relationships. Experiments demonstrate HIT's state-of-the-art performance, explainability, and ability to identify novel therapeutic targets.

PHPGAT: predicting phage hosts based on multimodal heterogeneous knowledge graph with graph attention network.
IF 6.8; Zone 2 (Biology); Q1 BIOCHEMICAL RESEARCH METHODS; Pub Date: 2024-11-22; DOI: 10.1093/bib/bbaf017
Fu Liu, Zhimiao Zhao, Yun Liu

Antibiotic resistance poses a significant threat to global health, making the development of alternative strategies to combat bacterial pathogens increasingly urgent. One such promising approach is the strategic use of bacteriophages (or phages) to specifically target and eradicate antibiotic-resistant bacteria. Phages, being among the most prevalent life forms on Earth, play a critical role in maintaining ecological balance by regulating bacterial communities and driving genetic diversity. Accurate prediction of phage hosts is essential for successfully applying phage therapy. However, existing prediction models may not fully encapsulate the complex dynamics of phage-host interactions in diverse microbial environments, indicating a need for improved accuracy through more sophisticated modeling techniques. In response to this challenge, this study introduces a novel phage-host prediction model, PHPGAT, which leverages a multimodal heterogeneous knowledge graph with the advanced GATv2 (Graph Attention Network v2) framework. The model first constructs a multimodal heterogeneous knowledge graph by integrating phage-phage, host-host, and phage-host interactions to capture the intricate connections between biological entities. GATv2 is then employed to extract deep node features and learn dynamic interdependencies, generating context-aware embeddings. Finally, an inner product decoder is designed to compute the likelihood of interaction between a phage and host pair based on the embedding vectors produced by GATv2. Evaluation results using two datasets demonstrate that PHPGAT achieves precise phage host predictions and outperforms other models. PHPGAT is available at https://github.com/ZhaoZMer/PHPGAT.
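
The final decoding step described above can be sketched in a few lines. The abstract only states that an inner product decoder turns embedding vectors into an interaction likelihood; squashing the dot product with a sigmoid is a common convention and is assumed here, as are the function name and the plain-list vector format. The embeddings themselves would come from GATv2 and are not computed in this sketch.

```python
import math


def inner_product_score(phage_vec, host_vec):
    """Sigmoid of the dot product of two embedding vectors.

    The sigmoid is an assumption of this sketch; it maps the raw inner
    product to a value between 0 and 1 that can be read as a likelihood.
    """
    dot = sum(p * h for p, h in zip(phage_vec, host_vec))
    return 1.0 / (1.0 + math.exp(-dot))
```

Embeddings that point in similar directions give a large positive dot product and a score near 1, while orthogonal embeddings give exactly 0.5, so the decoder itself has no trainable parameters and all learning happens in the encoder.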

Citations: 0
AntiBinder: utilizing bidirectional attention and hybrid encoding for precise antibody-antigen interaction prediction.
IF 6.8 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2024-11-22 DOI: 10.1093/bib/bbaf008
Kaiwen Zhang, Yuhao Tao, Fei Wang

Antibodies play a key role in medical diagnostics and therapeutics. Accurately predicting antibody-antigen binding is essential for developing effective treatments. Traditional protein-protein interaction prediction methods often fall short because they do not account for the unique structural and dynamic properties of antibodies and antigens. In this study, we present AntiBinder, a novel predictive model specifically designed to address these challenges. AntiBinder integrates the unique structural and sequence characteristics of antibodies and antigens into its framework and employs a bidirectional cross-attention mechanism to automatically learn the intrinsic mechanisms of antigen-antibody binding, eliminating the need for manual feature engineering. Our comprehensive experiments, which include predicting interactions between known antigens and new antibodies, predicting the binding of previously unseen antigens, and predicting cross-species antigen-antibody interactions, demonstrate that AntiBinder outperforms existing state-of-the-art methods. Notably, AntiBinder excels in predicting interactions with unseen antigens and maintains a reasonable level of predictive capability in challenging cross-species prediction tasks. AntiBinder's ability to model complex antigen-antibody interactions highlights its potential applications in biomedical research and therapeutic development, including the design of vaccines and antibody therapies for rapidly emerging infectious diseases.
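As a rough illustration of the bidirectional cross-attention idea mentioned in the abstract, the sketch below lets each antibody position attend over the antigen positions and each antigen position attend over the antibody positions, using plain scaled dot-product attention. It is a simplified sketch under stated assumptions: there are no learned query/key/value projections, no multiple heads, and the toy per-residue vectors are made up; the function names are hypothetical and this is not AntiBinder's actual architecture.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Replace each query vector by an attention-weighted average of keys_values."""
    dim = len(keys_values[0])
    attended = []
    for q in queries:
        # Scaled dot-product score between this query and every key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        attended.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                         for j in range(dim)])
    return attended


def bidirectional_cross_attention(antibody, antigen):
    """Run attention in both directions: antibody->antigen and antigen->antibody."""
    return cross_attend(antibody, antigen), cross_attend(antigen, antibody)


# Toy per-residue embeddings (hypothetical values).
antibody = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 positions
antigen = [[2.0, 0.0], [0.0, 2.0]]                # 2 positions

ab_ctx, ag_ctx = bidirectional_cross_attention(antibody, antigen)
```

Each output keeps the length of its query sequence, so `ab_ctx` has one antigen-aware vector per antibody position and `ag_ctx` one antibody-aware vector per antigen position.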
Citations: 0
Graph contrastive learning of subcellular-resolution spatial transcriptomics improves cell type annotation and reveals critical molecular pathways.
IF 6.8 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2024-11-22 DOI: 10.1093/bib/bbaf020
Qiaolin Lu, Jiayuan Ding, Lingxiao Li, Yi Chang

Imaging-based spatial transcriptomics (iST) platforms, such as MERFISH, CosMx SMI, and Xenium, quantify gene expression levels across cells in space; more importantly, they directly reveal the subcellular distribution of RNA transcripts at single-molecule resolution. The subcellular localization of RNA molecules plays a crucial role in the compartmentalization-dependent regulation of genes within individual cells. Understanding the intracellular spatial distribution of RNA for a particular cell type thus not only improves the characterization of cell identity but is also of paramount importance in elucidating subcellular regulatory mechanisms specific to that cell type. However, current cell type annotation approaches for iST primarily use gene expression information and neglect the spatial distribution of RNAs within cells. In this work, we introduce Focus, a semi-supervised graph contrastive learning method that is, to the best of our knowledge, the first to explicitly model RNA's subcellular distribution and community to improve cell type annotation. Focus demonstrates significant improvements over state-of-the-art algorithms across a range of spatial transcriptomics platforms, with gains of up to 27.8% in accuracy and 51.9% in F1-score for cell type annotation. Furthermore, Focus has the advantages of capturing intricate cell type-specific subcellular spatial gene patterns and providing interpretable subcellular gene analysis, such as a gene importance score. Importantly, with the importance score, Focus identifies genes with strong relevance to cell type-specific pathways, indicating its potential for uncovering novel regulatory programs across numerous biological systems.
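To make the notion of contrastive learning concrete, the following sketch computes an InfoNCE-style loss between two sets of embeddings, where row i of one view should match row i of the other and all other rows act as negatives. The abstract does not state which contrastive objective Focus uses, so InfoNCE, cosine similarity, the temperature value, and the toy embeddings are all assumptions chosen for illustration; `info_nce` is a hypothetical helper, not part of Focus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; the positive for view_a[i] is view_b[i]."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # log-sum-exp over all candidates, computed stably.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Two toy "views" of the same two cells (hypothetical embeddings).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [view_b[1], view_b[0]]   # positives deliberately mismatched

loss_aligned = info_nce(view_a, view_b)
loss_shuffled = info_nce(view_a, shuffled_b)
```

The loss is lower when matching rows are similar and mismatched rows are dissimilar, which is the signal a contrastive method trains an encoder on.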
Citations: 0