Genomics, proteomics & bioinformatics: Latest Articles

Biological Data Resources and Machine Learning Frameworks for Hematology Research.
Pub Date : 2025-03-04 DOI: 10.1093/gpbjnl/qzaf021
Ying Yi, Yongfei Hu, Juanjuan Kang, Qifa Liu, Yan Huang, Dong Wang

Hematology research has greatly benefited from the integration of diverse biological data resources and advanced machine learning frameworks. This integration has not only deepened our understanding of blood diseases such as leukemia and lymphoma, but also enhanced diagnostic accuracy and personalized treatment strategies. By applying machine learning algorithms to analyze large-scale biological data, researchers are able to more effectively identify disease patterns, predict treatment responses, and provide new perspectives for the diagnosis and treatment of hematologic disorders. Here, we provide an overview of the current landscape of biological data resources and the application of machine learning frameworks pertinent to hematology research.

Citations: 0
Single-cell Atlas of Developing Mouse Palates Reveals Cellular and Molecular Transitions in Periderm Cell Fate.
Pub Date : 2025-03-04 DOI: 10.1093/gpbjnl/qzaf013
Wenbin Huang, Zhenwei Qian, Jieni Zhang, Yi Ding, Bin Wang, Jiuxiang Lin, Xiannian Zhang, Huaxiang Zhao, Feng Chen

Cleft palate is one of the most common congenital craniofacial disorders that affects children's appearance and oral functions. Investigating the transcriptomics during palatogenesis is crucial for comprehending the etiology of this disorder and facilitating prenatal molecular diagnosis. However, there is limited knowledge about the single-cell differentiation dynamics during mid-palatogenesis and late-palatogenesis, specifically regarding the subpopulations and developmental trajectories of periderm, a rare but critical cell population. Here we explored the single-cell landscape of mouse developing palates from embryonic day (E) 10.5 to E16.5. We systematically depicted the single-cell transcriptomics of mesenchymal and epithelial cells during palatogenesis, including subpopulations and differentiation dynamics. Additionally, we identified four subclusters of palatal periderm and constructed two distinct trajectories of cell fates for periderm cells. Our findings reveal that claudin-family coding genes and Arhgap29 play a role in the non-stick function of the periderm before the palatal shelves contact, and Pitx2 mediates the adhesion of periderm during the contact of opposing palatal shelves. Furthermore, we demonstrated that epithelial-mesenchymal transition (EMT), apoptosis, and migration collectively contribute to the degeneration of periderm cells in the medial epithelial seam. Taken together, our study suggests a novel model of periderm development during palatogenesis and delineates the cellular and molecular transitions in periderm cell determination.

Citations: 0
PhaSeDis: A Manually Curated Database of Phase Separation-Disease Associations and Corresponding Small Molecules.
Pub Date : 2025-03-04 DOI: 10.1093/gpbjnl/qzaf014
Taoyu Chen, Guoguo Tang, Tianhao Li, Zhining Yanghong, Chao Hou, Zezhou Du, Kaiqiang You, Liwei Ma, Tingting Li

Biomacromolecules form membraneless organelles through liquid-liquid phase separation in order to regulate the efficiency of particular biochemical reactions. Dysregulation of phase separation might result in pathological condensation or sequestration of biomolecules, leading to diseases. Thus, phase separation and phase separating factors may serve as drug targets for disease treatment. Nevertheless, such associations have not yet been integrated into phase separation related databases. Therefore, based on MloDisDB, a database for membraneless organelle factor-disease association previously developed by our lab, we constructed PhaSeDis, the phase separation-disease association database. We increased the number of phase separation entries from 52 to 185, and supplemented the evidence provided by the original article verifying the phase separation nature of the factors. Moreover, we included the information of interacting small molecules with low-throughput or high-throughput evidence that might serve as potential drugs for phase separation entries. PhaSeDis strives to offer comprehensive descriptions of each entry, elucidating how phase separating factors induce pathological conditions via phase separation and the mechanisms by which small molecules intervene. We believe that PhaSeDis would be very important in the application of phase separation regulation in treating related diseases. PhaSeDis is available at http://mlodis.phasep.pro.

Citations: 0
A Developmental Gene Expression Atlas Reveals Novel Biological Basis of Complex Phenotypes in Sheep.
Pub Date : 2025-03-04 DOI: 10.1093/gpbjnl/qzaf020
Bingru Zhao, Hanpeng Luo, Xuefeng Fu, Guoming Zhang, Emily L Clark, Feng Wang, Brian Paul Dalrymple, V Hutton Oddy, Philip E Vercoe, Cuiling Wu, George E Liu, Cong-Jun Li, Ruidong Xiang, Kechuan Tian, Yanli Zhang, Lingzhao Fang

Sheep (Ovis aries) represents one of the most important livestock species for animal protein and wool production worldwide. However, little is known about the genetic and biological basis of ovine phenotypes, particularly for those of high economic value and environmental impact. Here, by integrating 1413 RNA-seq samples from 51 distinct tissues across 14 developmental time points, representing early prenatal, late prenatal, neonate, lamb, juvenile, adult, and elderly stages, we built a high-resolution developmental Gene Expression Atlas (dGEA) in sheep. We observed dynamic patterns of gene expression and regulatory networks across tissues and developmental stages. When harnessing this resource for interpreting genetic associations of 48 monogenetic and 12 complex traits in sheep, we found that genes upregulated at prenatal developmental stages played more important roles in shaping these phenotypes than those upregulated at postnatal stages. For instance, genetic associations of crimp number, mean staple length (MSL), and individual birth weight were significantly enriched in the prenatal rather than postnatal skin and immune tissues. By comprehensively integrating GWAS fine-mapping results and the sheep dGEA, we proposed several candidate genes for complex traits in sheep, such as SOX9 for MSL, GNRHR for litter size at birth, and PRKDC for live weight. These results provide novel insights into the developmental and molecular architecture underlying ovine phenotypes. The dGEA (https://sheepdgea.njau.edu.cn/) will serve as an invaluable resource for sheep developmental biology, genetics, genomics, and selective breeding.

Citations: 0
PIGOME: An Integrated and Comprehensive Multi-omics Database for Pig Functional Genomics Studies.
Pub Date : 2025-02-28 DOI: 10.1093/gpbjnl/qzaf016
Guohao Han, Peng Yang, Yongjin Zhang, Qiaowei Li, Xinhao Fan, Ruipu Chen, Chao Yan, Mu Zeng, Yalan Yang, Zhonglin Tang

In addition to being a major source of animal protein, pigs are an important model for the study of development and diseases in humans. During the past two decades, thousands of high-throughput sequencing studies in pigs have been performed using a variety of tissues from different breeds and developmental stages. However, the multi-omics database specifically used for pig functional genomic research is still limited. Here, we present a user-friendly database of pig multi-omics named PIGOME. PIGOME currently contains seven types of pig omics datasets, including whole-genome sequencing (WGS), RNA sequencing (RNA-seq), microRNA sequencing (miRNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), assay for transposase-accessible chromatin sequencing (ATAC-seq), bisulfite sequencing (BS-seq), and methylated RNA immunoprecipitation sequencing (MeRIP-seq), from 6901 samples and 392 projects with manually curated metadata, integrated gene annotation, and quantitative trait locus information. Furthermore, various "Explore" and "Browse" functions have been established for user-friendly access to omics information. PIGOME implemented several tools to visualize genomic variants, gene expression, and epigenetic signals of a given gene in the pig genome, enabling efficient exploration of spatial-temporal gene expression/epigenetic pattern, function, regulatory mechanism, and associated economic traits. Collectively, PIGOME provides valuable resources for pig breeding and is helpful for human biomedical research. PIGOME is available at https://pigome.com.

Citations: 0
LigExtract: Large-scale Automated Identification of Ligands from Protein Structures in the Protein Data Bank.
Pub Date : 2025-02-28 DOI: 10.1093/gpbjnl/qzaf018
Natália Aniceto, Nuno Martinho, Ismael Rufino, Rita C Guedes

The Protein Data Bank is an ever-growing database of 3D macromolecular structures that has become a crucial resource for the drug discovery process. Exploring complexed proteins and accessing the ligands in these proteins is paramount to help researchers understand biological processes and design new compounds of pharmaceutical interest. However, currently available tools to perform large-scale ligand identification do not address many of the more complex ways in which ligands are stored and represented in PDB structures. Therefore, a new tool called LigExtract was specifically developed for the large-scale processing of PDB structures and the identification of their ligands. This is a fully open-source tool available to the scientific community, designed to provide end-to-end processing whereby the user simply provides a list of UniProt IDs and LigExtract returns a list of ligands, their individual PDB files, a PDB file of the protein chains engaged with the ligand and a series of log files that inform the user of the decisions made during the ligand extraction process as well as potential flagging of additional scenarios that might have to be considered during any follow-up use of the processed files (e.g., ligands covalently bound to the protein). LigExtract is available, open-source, on GitHub (https://github.com/comp-medchem/LigExtract).

Citations: 0
Challenges in AI-driven Biomedical Multimodal Data Fusion and Analysis.
Pub Date : 2025-02-27 DOI: 10.1093/gpbjnl/qzaf011
Junwei Liu, Xiaoping Cen, Chenxin Yi, Feng-Ao Wang, Junxiang Ding, Jinyu Cheng, Qinhua Wu, Baowen Gai, Yiwen Zhou, Ruikun He, Feng Gao, Yixue Li

The rapid development of biological and medical examination methods has vastly expanded personal biomedical information, including molecular, cellular, image, and electronic health record datasets. Integrating this wealth of information enables precise disease diagnosis, biomarker identification, and treatment design in clinical settings. Artificial intelligence (AI) techniques, particularly deep learning models, have been extensively employed in biomedical applications, demonstrating increased precision, efficiency, and generalization. The success of the large language and vision models further significantly extends their biomedical applications. However, challenges remain in learning these multimodal biomedical datasets, such as data privacy, fusion, and model interpretation. In this review, we provided a comprehensive overview of various biomedical data modalities, multi-modal representation learning methods, and the applications of AI in biomedical data integrative analysis. Additionally, we discussed the challenges in applying these deep learning methods and how to better integrate them into biomedical scenarios. We then proposed future directions for adapting deep learning methods with model pre-training and knowledge integration to advance biomedical research and benefit their clinical applications.

Citations: 0
Evaluative Methodology for HRD Testing: Development of Standard Tools for Consistency Assessment.
Pub Date : 2025-02-27 DOI: 10.1093/gpbjnl/qzaf017
Zheng Jia, Yaqing Liu, Shoufang Qu, Wenbin Li, Lin Gao, Lin Dong, Yun Xing, Yadi Cheng, Huan Fang, Yuting Yi, Yuxing Chu, Chao Zhang, Yanming Xie, Chunli Wang, Zhe Li, Zhihong Zhang, Zhipeng Xu, Yang Wang, Wenxin Zhang, Xiaoping Gu, Shuang Yang, Jinghua Li, Liangshen Wei, Yuanting Zheng, Guohui Ding, Leming Shi, Xin Yi, Jianming Ying, Jie Huang

Homologous recombination deficiency (HRD) has emerged as a critical prognostic and predictive biomarker in oncology. However, current testing methods, especially those reliant on targeted panels, are plagued by inconsistent results from the same samples. This highlights the urgent need for standardized benchmarks to evaluate HRD assay performance. In phases IIa and IIb of the Chinese HRD Harmonization Project, we developed ten pairs of well-characterized DNA reference materials derived from lung, breast, and melanoma cancer cell lines and their matched normal cell lines, each paired with seven cancer-to-normal mass ratios. Reference datasets for allele-specific copy number variations (ASCNVs) and HRD scores were established and validated based on three sequencing methods and nine analytical pipelines. The Genomic Instability Scores (GIS) of the reference materials ranged from 11 to 96, enabling validation across various thresholds. The ASCNV reference datasets covered a genomic span of 2340 to 2749 Mb, equivalent to 81.2% to 95.4% of the autosomes in the 37d5 reference genome. These benchmarks were subsequently utilized to assess the accuracy and reproducibility of four HRD panel assays, revealing significant variability in both ASCNV detection and HRD scores. The concordance between panel-detected GIS and reference GIS ranged from 0.81 to 0.94, and only two assays exhibited high overall agreement with Myriad MyChoice CDx for HRD classification. This study also identified specific challenges in ASCNV detection in HRD-related regions and the profound impact of high ploidy on consistency. The established HRD reference materials and datasets provide a robust toolkit for objective evaluation of HRD testing.
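The coverage figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the study: the helper name `implied_total_mb` is hypothetical, and the only inputs are the span and percentage values quoted in the abstract. Dividing each span by its stated fraction gives the autosome length that each pair of numbers implies; if the two results agree, the quoted ranges are internally consistent.

```python
# Illustrative consistency check on the figures quoted in the abstract.
# `implied_total_mb` is a hypothetical helper, not from the paper.

def implied_total_mb(span_mb: float, fraction: float) -> float:
    """Return the total length (Mb) implied by a covered span and its fraction."""
    return span_mb / fraction

# Lower and upper bounds as quoted: 2340 Mb = 81.2%, 2749 Mb = 95.4%.
low_total = implied_total_mb(2340, 0.812)
high_total = implied_total_mb(2749, 0.954)

print(f"Implied autosome length (lower bound):  {low_total:.1f} Mb")
print(f"Implied autosome length (upper bound): {high_total:.1f} Mb")
print(f"Difference: {abs(low_total - high_total):.2f} Mb")
```

Both bounds imply an autosome length of roughly 2880 Mb, so the span and percentage ranges appear to be stated against the same denominator.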

Citations: 0
MS-based Solutions for Single Cell Proteomics.
Pub Date : 2025-02-22 DOI: 10.1093/gpbjnl/qzaf012
Siqi Li, Shuwei Li, Siqi Liu, Yan Ren

Mass spectrometry-based single cell proteomics (MS-SCP) is attracting tremendous attention because it is now technically feasible to quantify thousands of proteins in minute samples. Since protein amplification is still not possible, technological improvements in MS-SCP focus on minimizing sample loss and increasing throughput, resolution, and sensitivity, as well as achieving the same measurement depth, accuracy, and stability as with bulk samples. Major advances in MS-SCP have facilitated its use in biological and even medical applications. Here, we review the key advancements in MS-SCP technology and discuss strategies for improving each step of the classic proteomics workflow for MS-SCP analysis, from single cell isolation, sample preparation, and liquid chromatography separation to MS data acquisition and analysis. The review provides an overall understanding of the development and application of MS-SCP and aims to inspire new ideas for advancing MS-SCP technology.

Citations: 0
The Updated Genome Warehouse: Enhancing Data Value, Security, and Usability to Address Data Expansion.
Pub Date : 2025-02-20 DOI: 10.1093/gpbjnl/qzaf010
Yingke Ma, Xuetong Zhao, Yaokai Jia, Zhenxian Han, Caixia Yu, Zhuojing Fan, Zhang Zhang, Jingfa Xiao, Wenming Zhao, Yiming Bao, Meili Chen

The Genome Warehouse (GWH), accessible at https://ngdc.cncb.ac.cn/gwh, is an extensively utilized public repository dedicated to the deposition, management and sharing of genome assembly sequences, annotations, and metadata. This paper highlights noteworthy enhancements to the GWH since the 2021 version, emphasizing substantial advancements in web interfaces for data submission, database functionality updates, and resource integration. Key updates include the reannotation of released prokaryotic genomes, mirroring of genome resources from National Center for Biotechnology Information (NCBI) GenBank and Reference Sequence Database (RefSeq), integration of Poxviridae sequences, implementation of an online batch submission system, enhancements to the quality control system, advanced search capabilities, and the introduction of a controlled-access mechanism for human genome data. These improvements collectively augment the ease and security of data submission and access as well as genome data value, thereby fostering heightened convenience and utility for researchers in the genomics field.

Citations: 0