Pub Date: 2023-12-01 DOI: 10.1016/j.gpb.2023.04.002
Ann-Yae Na, Hyojin Lee, Eun Ki Min, Sanjita Paudel, So Young Choi, HyunChae Sim, Kwang-Hyeon Liu, Ki-Tae Kim, Jong-Sup Bae, Sangkyu Lee
Recently developed technologies for analyzing each single omics layer have provided unbiased insight into ongoing disease processes. However, it remains challenging to specify the study design for subsequent integration strategies that can associate sepsis pathophysiology with clinical outcomes. Here, we conducted a time-dependent multi-omics integration (TDMI) in a sepsis-associated liver dysfunction (SALD) model. We successfully deduced the relation of the Toll-like receptor 4 (TLR4) pathway with SALD. Although TLR4 is a critical factor in sepsis progression, it was identified only in the TDMI analysis and not in the single-omics analyses. This finding indicates that the TDMI-based approach is more advantageous than single-omics analyses for exploring the underlying pathophysiological mechanism of SALD. Furthermore, the TDMI-based approach can be an ideal paradigm for insightful biological interpretation of multi-omics datasets, potentially revealing novel insights into basic biology, health, and disease and thus allowing the identification of promising candidates for therapeutic strategies.
{"title":"Novel Time-dependent Multi-omics Integration in Sepsis-associated Liver Dysfunction","authors":"Ann-Yae Na, Hyojin Lee, Eun Ki Min, Sanjita Paudel, So Young Choi, HyunChae Sim, Kwang-Hyeon Liu, Ki-Tae Kim, Jong-Sup Bae, Sangkyu Lee","doi":"10.1016/j.gpb.2023.04.002","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 6","pages":"Pages 1101-1116"},"PeriodicalIF":11.5,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082264/pdf/"}
Post-translational modifications (PTMs) have key roles in extending the functional diversity of proteins and, as a result, regulating diverse cellular processes in prokaryotic and eukaryotic organisms. Phosphorylation is a vital PTM that occurs in most proteins and plays a significant role in many biological processes. Disorders in the phosphorylation process lead to multiple diseases, including neurological disorders and cancers. The purpose of this review is to organize the body of knowledge associated with phosphorylation site (p-site) prediction to facilitate future research in this field. First, we comprehensively review all related databases and introduce all steps regarding dataset creation, data preprocessing, and method evaluation in p-site prediction. Next, we investigate p-site prediction methods, which are divided into two computational groups: algorithmic and machine learning (ML). Additionally, we show that there are two main approaches to p-site prediction by ML, conventional and end-to-end deep learning methods, and give an overview of both. Moreover, this review introduces the most important feature extraction techniques that have been used in p-site prediction. Finally, we create three test sets for general and human species from new proteins in the 2022 release of the database of protein post-translational modifications (dbPTM). Evaluating online p-site prediction tools on proteins newly added in the dbPTM 2022 release, distinct from those in the dbPTM 2019 release, reveals their limitations. In other words, the actual performance of these online p-site prediction tools on unseen proteins is notably lower than the results reported in their respective research papers.
{"title":"A Review of Machine Learning and Algorithmic Methods for Protein Phosphorylation Site Prediction","authors":"Farzaneh Esmaili, Mahdi Pourmirzaei, Shahin Ramazi, Seyedehsamaneh Shojaeilangari, Elham Yavari","doi":"10.1016/j.gpb.2023.03.007","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 6","pages":"Pages 1266-1285"},"PeriodicalIF":11.5,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082408/pdf/"}
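The hold-out evaluation described in the review abstract above (testing tools only on proteins added in a newer database release) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the protein IDs are hypothetical placeholders, and the Matthews correlation coefficient is used here only as one commonly reported metric, which is an assumption about the evaluation.

```python
from math import sqrt


def newly_added(old_ids, new_ids):
    """Return IDs present in the newer release but absent from the older one."""
    return sorted(set(new_ids) - set(old_ids))


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary site labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and pred:
            fp += 1
        elif truth and not pred:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical protein IDs standing in for two database releases.
old_release = ["P1", "P2", "P3"]
new_release = ["P1", "P2", "P3", "P4", "P5"]
unseen = newly_added(old_release, new_release)
print(unseen)
```

Restricting the test set to `unseen` keeps proteins that a tool may have been trained on out of the evaluation, which is the point the abstract makes about reported versus actual performance.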
Pub Date: 2023-12-01 DOI: 10.1016/j.gpb.2022.09.002
Jiazhen Zheng, Yue Li, Ning Liu, Jihui Zhang, Shuangjiang Liu, Huarong Tan
Streptomycetes possess numerous gene clusters and the potential to produce a large number of natural products. Histone deacetylase (HDAC) inhibitors play an important role in the regulation of histone modifications in fungi, but their roles in prokaryotes remain poorly understood. Here, we investigated the global effects of the HDAC inhibitor sodium butyrate (SB) on marine-derived Streptomyces olivaceus FXJ 8.021, particularly focusing on the activation of secondary metabolite biosynthesis. The antiSMASH analysis revealed 33 secondary metabolite biosynthetic gene clusters (BGCs) in strain FXJ 8.021, among which the silent lobophorin BGC was activated by SB. Transcriptomic data showed that the expression of genes involved in lobophorin biosynthesis (ge00097–ge00139) and CoA-ester formation (e.g., ge02824), as well as the glycolysis/gluconeogenesis pathway (e.g., ge01661), was significantly up-regulated in the presence of SB. Intracellular CoA-ester analysis confirmed that SB triggered the biosynthesis of CoA-ester, thereby increasing the precursor supply for lobophorin biosynthesis. Further acetylomic analysis revealed that the acetylation levels on 218 sites of 190 proteins were up-regulated and those on 411 sites of 310 proteins were down-regulated. These acetylated proteins were particularly enriched in transcriptional and translational machinery components (e.g., elongation factor GE04399), and their correlations with the proteins involved in lobophorin biosynthesis were established by protein–protein interaction network analysis, suggesting that SB might function via a complex hierarchical regulation to activate the expression of lobophorin BGC. These findings provide solid evidence that acetylated proteins triggered by SB could affect the expression of genes involved in the biosynthesis of primary and secondary metabolites in prokaryotes.
{"title":"Multi-omics Data Reveal the Effect of Sodium Butyrate on Gene Expression and Protein Modification in Streptomyces","authors":"Jiazhen Zheng, Yue Li, Ning Liu, Jihui Zhang, Shuangjiang Liu, Huarong Tan","doi":"10.1016/j.gpb.2022.09.002","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 6","pages":"Pages 1149-1162"},"PeriodicalIF":11.5,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11082262/pdf/"}
Pub Date: 2023-10-01 DOI: 10.1016/j.gpb.2023.06.001
Yaojun Wang, Shiwei Sun
{"title":"Revolutionizing Antibody Discovery: An Innovative AI Model for Generating Robust Libraries","authors":"Yaojun Wang, Shiwei Sun","doi":"10.1016/j.gpb.2023.06.001","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 910-912"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928364/pdf/"}
Pub Date: 2023-10-01 DOI: 10.1016/j.gpb.2023.03.001
Lin-Fang Ju, Heng-Ji Xu, Yun-Gui Yang, Ying Yang
During mammalian preimplantation development, a totipotent zygote undergoes several cell cleavages and two rounds of cell fate determination, ultimately forming a mature blastocyst. Along with compaction, the establishment of apicobasal cell polarity breaks the symmetry of an embryo and guides subsequent cell fate choice. Although the lineage segregation of the inner cell mass (ICM) and trophectoderm (TE) is the first sign of cell differentiation, several molecules have been shown to bias early cell fate through their inter-cellular variations at much earlier stages, including the 2- and 4-cell stages. The underlying mechanisms of early cell fate determination have long been an important research topic. In this review, we summarize the molecular events that occur during early embryogenesis, as well as the current understanding of their regulatory roles in cell fate decisions. Moreover, as powerful tools for early embryogenesis research, single-cell omics techniques have been applied to both mouse and human preimplantation embryos and have contributed to the discovery of cell fate regulators. Here, we summarize their applications in the research of preimplantation embryos, and provide new insights and perspectives on cell fate regulation.
{"title":"Omics Views of Mechanisms for Cell Fate Determination in Early Mammalian Development","authors":"Lin-Fang Ju, Heng-Ji Xu, Yun-Gui Yang, Ying Yang","doi":"10.1016/j.gpb.2023.03.001","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 950-961"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928378/pdf/"}
Pub Date: 2023-10-01 DOI: 10.1016/j.gpb.2022.08.002
Yicong Shen, Yuanxu Gao, Jiangcheng Shi, Zhou Huang, Rongbo Dai, Yi Fu, Yuan Zhou, Wei Kong, Qinghua Cui
Abdominal aortic aneurysm (AAA) is a permanent dilatation of the abdominal aorta and is highly lethal. The main purpose of the current study is to search for noninvasive medical therapies for AAA, for which there is currently no effective drug therapy. Network medicine represents a cutting-edge technology, as analysis and modeling of disease networks can provide critical clues regarding the etiology of specific diseases and therapeutics that may be effective. Here, we proposed a novel algorithm to quantify disease relations based on a large accumulated microRNA–disease association dataset and then built a disease network covering 15 disease classes and 304 diseases. Analysis revealed some patterns for these diseases. For instance, diseases tended to be clustered and coherent in the network. Surprisingly, we found that AAA showed the strongest similarity with rheumatoid arthritis and systemic lupus erythematosus, both of which are autoimmune diseases, suggesting that AAA could be etiologically a type of autoimmune disease. Based on this observation, we further hypothesized that drugs for autoimmune diseases could be repurposed for the prevention and therapy of AAA. Finally, animal experiments confirmed that methotrexate, a drug for autoimmune diseases, was able to alleviate the formation and development of AAA.
{"title":"MicroRNA–disease Network Analysis Repurposes Methotrexate for the Treatment of Abdominal Aortic Aneurysm in Mice","authors":"Yicong Shen, Yuanxu Gao, Jiangcheng Shi, Zhou Huang, Rongbo Dai, Yi Fu, Yuan Zhou, Wei Kong, Qinghua Cui","doi":"10.1016/j.gpb.2022.08.002","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 1030-1042"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928436/pdf/"}
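The idea of quantifying disease relations from shared microRNA associations and linking similar diseases into a network can be illustrated with a small sketch. The abstract does not describe the authors' algorithm, so Jaccard similarity is swapped in here purely as a stand-in; the disease and microRNA names are hypothetical placeholders, not data from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two microRNA sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(assoc, threshold):
    """Return {(disease1, disease2): similarity} for pairs at or above threshold."""
    edges = {}
    for d1, d2 in combinations(sorted(assoc), 2):
        score = jaccard(assoc[d1], assoc[d2])
        if score >= threshold:
            edges[(d1, d2)] = score
    return edges


def most_similar(assoc, query):
    """Return the disease whose microRNA set is most similar to the query's."""
    others = [d for d in assoc if d != query]
    return max(others, key=lambda d: jaccard(assoc[query], assoc[d]))


# Hypothetical associations: disease -> set of associated microRNAs.
assoc = {
    "disease_A": {"miR-a", "miR-b", "miR-c"},
    "disease_B": {"miR-b", "miR-c", "miR-d"},
    "disease_C": {"miR-x"},
}
network = build_network(assoc, threshold=0.3)
print(network)
print(most_similar(assoc, "disease_A"))
```

In a repurposing setting, the nearest neighbors of a query disease in such a network suggest diseases whose existing drugs may be worth testing, which mirrors the reasoning the abstract describes.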
Pub Date: 2023-10-01 DOI: 10.1016/j.gpb.2023.02.004
Wenbin Li, Lin Gao, Xin Yi, Shuangfeng Shi, Jie Huang, Leming Shi, Xiaoyan Zhou, Lingying Wu, Jianming Ying
Defects in genes involved in the DNA damage response cause homologous recombination repair deficiency (HRD). HRD is found in a subgroup of cancer patients across several tumor types, and it has clinical relevance to cancer prevention and therapies. Accumulating evidence has identified HRD as a biomarker for assessing the therapeutic response of tumor cells to poly(ADP-ribose) polymerase inhibitors and platinum-based chemotherapies. Nevertheless, the biology of HRD is complex, and its applications and the benefits of different HRD biomarker assays are controversial. This is primarily due to inconsistencies in HRD assessments and definitions (gene-level tests, genomic scars, mutational signatures, or a combination of these methods) and difficulties in assessing the contribution of each genomic event. Therefore, we aim to review the biological rationale and clinical evidence of HRD as a biomarker. This review provides a blueprint for the standardization and harmonization of HRD assessments.
{"title":"Patient Assessment and Therapy Planning Based on Homologous Recombination Repair Deficiency","authors":"Wenbin Li, Lin Gao, Xin Yi, Shuangfeng Shi, Jie Huang, Leming Shi, Xiaoyan Zhou, Lingying Wu, Jianming Ying","doi":"10.1016/j.gpb.2023.02.004","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 962-975"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928375/pdf/"}
Pub Date: 2023-10-01 DOI: 10.1016/j.gpb.2023.09.003
Enhui Jin, Dongli Zhao, Gangao Wu, Junwei Zhu, Zhonghuang Wang, Zhiyao Wei, Sisi Zhang, Anke Wang, Bixia Tang, Xu Chen, Yanling Sun, Zhe Zhang, Wenming Zhao, Yuanguang Meng
With the development of artificial intelligence (AI) technologies, biomedical imaging data play an important role in scientific research and clinical application, but the available resources are limited. Here we present Open Biomedical Imaging Archive (OBIA), a repository for archiving biomedical imaging and related clinical data. OBIA adopts five data objects (Collection, Individual, Study, Series, and Image) for data organization, and accepts the submission of biomedical images of multiple modalities, organs, and diseases. In order to protect personal privacy, OBIA has formulated a unified de-identification and quality control process. In addition, OBIA provides friendly and intuitive web interfaces for data submission, browsing, and retrieval, as well as image retrieval. As of September 2023, OBIA has housed data for a total of 937 individuals, 4136 studies, 24,701 series, and 1,938,309 images covering 9 modalities and 30 anatomical sites. Collectively, OBIA provides a reliable platform for biomedical imaging data management and offers free open access to all publicly available data to support research activities throughout the world. OBIA can be accessed at https://ngdc.cncb.ac.cn/obia.
{"title":"OBIA: An Open Biomedical Imaging Archive","authors":"Enhui Jin, Dongli Zhao, Gangao Wu, Junwei Zhu, Zhonghuang Wang, Zhiyao Wei, Sisi Zhang, Anke Wang, Bixia Tang, Xu Chen, Yanling Sun, Zhe Zhang, Wenming Zhao, Yuanguang Meng","doi":"10.1016/j.gpb.2023.09.003","journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 1059-1065"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928373/pdf/"}
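The five data objects named in the OBIA abstract (Collection, Individual, Study, Series, and Image) can be pictured as a nested hierarchy. The sketch below assumes a strict parent-to-child nesting in that order; the field names are hypothetical and are not taken from OBIA's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Image:
    image_id: str


@dataclass
class Series:
    series_id: str
    modality: str  # hypothetical field, e.g. "CT" or "MR"
    images: List[Image] = field(default_factory=list)


@dataclass
class Study:
    study_id: str
    series: List[Series] = field(default_factory=list)


@dataclass
class Individual:
    individual_id: str
    studies: List[Study] = field(default_factory=list)


@dataclass
class Collection:
    title: str
    individuals: List[Individual] = field(default_factory=list)


def count_objects(collection):
    """Count objects at each level below a collection."""
    studies = [s for ind in collection.individuals for s in ind.studies]
    series = [se for s in studies for se in s.series]
    images = [im for se in series for im in se.images]
    return {
        "individuals": len(collection.individuals),
        "studies": len(studies),
        "series": len(series),
        "images": len(images),
    }


# A toy collection: one individual, one study, two series, three images.
demo = Collection("demo", [Individual("I1", [Study("ST1", [
    Series("SE1", "CT", [Image("IM1"), Image("IM2")]),
    Series("SE2", "MR", [Image("IM3")]),
])])])
print(count_objects(demo))
```

Summing such per-level counts over all collections would give archive-wide totals of the kind the abstract reports for individuals, studies, series, and images.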
Pub Date: 2023-10-01 DOI: 10.1016/j.gpb.2023.05.001
Yuliang Pan, Ruiyi Li, Wengen Li, Liuzhenghao Lv, Jihong Guan, Shuigeng Zhou
A fundamental principle of biology is that proteins tend to form complexes to play important roles in the core functions of cells. For a complete understanding of human cellular functions, it is crucial to have a comprehensive atlas of human protein complexes. Unfortunately, we still lack such a comprehensive atlas of experimentally validated protein complexes, which prevents us from gaining a complete understanding of the compositions and functions of human protein complexes, as well as the underlying biological mechanisms. To fill this gap, we built the Human Protein Complexes Atlas (HPC-Atlas), to our knowledge the most accurate and comprehensive atlas of human protein complexes available to date. We integrated two of the latest protein interaction networks and developed a novel computational method to identify nearly 9000 protein complexes, including many previously uncharacterized complexes. Compared with existing methods, our method achieved outstanding performance on both testing and independent datasets. Furthermore, with HPC-Atlas we identified 751 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-affected human protein complexes, and 456 multifunctional proteins that contain many potential moonlighting proteins. These results suggest that HPC-Atlas can serve not only as a computing framework to effectively identify biologically meaningful protein complexes by integrating multiple protein data sources, but also as a valuable resource for exploring new biological findings. The HPC-Atlas webserver is freely available at http://www.yulpan.top/HPC-Atlas.
{"title":"HPC-Atlas: Computationally Constructing A Comprehensive Atlas of Human Protein Complexes","authors":"Yuliang Pan, Ruiyi Li, Wengen Li, Liuzhenghao Lv, Jihong Guan, Shuigeng Zhou","doi":"10.1016/j.gpb.2023.05.001","DOIUrl":"10.1016/j.gpb.2023.05.001","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 976-990"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928439/pdf/","platform":"Semanticscholar","paperid":"41124619","RegionNum":2,"RegionCategory":"Biology","PMCID":"OA","Total":0}
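The HPC-Atlas abstract above mentions integrating two protein interaction networks before identifying complexes. As a rough illustration only, the sketch below merges two undirected edge lists and counts how many sources support each interaction. It is not the authors' method: the function names, the support count, and the protein identifiers (`P1`, `P2`, ...) are all assumptions made for this example.

```python
from typing import Iterable


def normalize(edges: Iterable[tuple[str, str]]) -> set[frozenset[str]]:
    """Turn (a, b) pairs into undirected edges, dropping self-interactions."""
    return {frozenset((a, b)) for a, b in edges if a != b}


def integrate(
    net_a: Iterable[tuple[str, str]],
    net_b: Iterable[tuple[str, str]],
) -> dict[frozenset[str], int]:
    """Union two networks and record how many sources support each edge."""
    a, b = normalize(net_a), normalize(net_b)
    return {edge: int(edge in a) + int(edge in b) for edge in a | b}


# Hypothetical toy networks; real inputs would be full interaction datasets.
net_a = [("P1", "P2"), ("P2", "P3"), ("P4", "P4")]
net_b = [("P2", "P1"), ("P3", "P5")]
merged = integrate(net_a, net_b)
```

Storing each edge as a `frozenset` makes `("P1", "P2")` and `("P2", "P1")` compare equal, so an interaction reported by both sources is counted once with a support of 2.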
Antibody leads must fulfill multiple desirable properties to become clinical candidates. Because experimental procedures are low-throughput and addressing one issue usually causes another, the need for such multi-property optimization creates a bottleneck in preclinical antibody discovery and development. We developed a reinforcement learning (RL) method, named AB-Gen, for antibody library design using a generative pre-trained transformer (GPT) as the policy network of the RL agent. We showed that this model can learn the antibody space of heavy chain complementarity determining region 3 (CDRH3) and generate sequences with similar property distributions. Moreover, when human epidermal growth factor receptor-2 (HER2) was used as the target, the agent model of AB-Gen was able to generate novel CDRH3 sequences that fulfill multi-property constraints. In total, 509 generated sequences passed all property filters, and three highly conserved residues were identified. The importance of these residues was further demonstrated by molecular dynamics simulations, confirming that the agent model was capable of grasping important information in this complex optimization task. Overall, the AB-Gen method is able to design novel antibody sequences with a higher success rate than the traditional propose-then-filter approach. It has the potential to be used in practical antibody design, thus empowering the antibody discovery and development process. The source code of AB-Gen is freely available at Zenodo (https://doi.org/10.5281/zenodo.7657016) and BioCode (https://ngdc.cncb.ac.cn/biocode/tools/BT007341).
{"title":"AB-Gen: Antibody Library Design with Generative Pre-trained Transformer and Deep Reinforcement Learning","authors":"Xiaopeng Xu, Tiantian Xu, Juexiao Zhou, Xingyu Liao, Ruochi Zhang, Yu Wang, Lu Zhang, Xin Gao","doi":"10.1016/j.gpb.2023.03.004","DOIUrl":"10.1016/j.gpb.2023.03.004","PeriodicalId":12528,"journal":{"name":"Genomics, Proteomics & Bioinformatics","volume":"21 5","pages":"Pages 1043-1053"},"PeriodicalIF":11.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10928431/pdf/","platform":"Semanticscholar","paperid":"10045398","RegionNum":2,"RegionCategory":"Biology","PMCID":"OA","Total":0}
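The AB-Gen abstract above contrasts its RL agent with the traditional propose-then-filter approach. The sketch below illustrates only that baseline: it proposes random CDRH3-like sequences and keeps those that pass every property filter. It is not the AB-Gen code, and the three filters and their thresholds are hypothetical stand-ins chosen for illustration, not the property filters used in the paper.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWY")


def net_charge(seq: str) -> int:
    """Crude net charge: +1 per K/R, -1 per D/E (histidine ignored)."""
    return sum(seq.count(r) for r in "KR") - sum(seq.count(r) for r in "DE")


def hydrophobic_fraction(seq: str) -> float:
    """Fraction of residues that belong to the hydrophobic set."""
    return sum(r in HYDROPHOBIC for r in seq) / len(seq)


# Hypothetical property filters; the thresholds are illustrative only.
FILTERS = {
    "length": lambda s: 10 <= len(s) <= 15,
    "charge": lambda s: -1 <= net_charge(s) <= 2,
    "hydrophobicity": lambda s: hydrophobic_fraction(s) <= 0.5,
}


def passes_all(seq: str) -> bool:
    """A sequence survives only if every property filter accepts it."""
    return all(check(seq) for check in FILTERS.values())


def propose_then_filter(n: int, seed: int = 0) -> list[str]:
    """Propose n random sequences of length 8 to 18, then filter them."""
    rng = random.Random(seed)
    proposals = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(rng.randint(8, 18)))
        for _ in range(n)
    ]
    return [s for s in proposals if passes_all(s)]


kept = propose_then_filter(200)
success_rate = len(kept) / 200
```

Because the proposals are generated without regard to the filters, each added constraint tends to reduce the fraction that survives. That falling success rate is the inefficiency an RL agent, rewarded for satisfying the constraints during generation, is meant to address.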