Pei-Hong Zhang, Hua Feng, Xu-Kai Ma, Fang Nan, Li Yang
Polyadenylation site (PAS) selection plays important roles in regulating gene expression and function. RNA-seq data derived from 3' tag sequencing carry intrinsic information about PAS usage and have been used to analyze alternative polyadenylation (APA) isoform expression in both bulk and single-cell samples. Here, we upgraded our previously developed deep learning-based PAS analysis pipeline SCAPTURE v2 to profile PASs from 1330 published 3' tag-based scRNA-seq datasets across seven species, resulting in a comprehensive cross-species PAS landscape. Validation with long-read sequencing data from matched human tissues showed that SCAPTURE profiles PASs at the single-cell level with high accuracy, including previously unannotated PASs. Further comparisons revealed distinct PAS usage preferences in different species, such as human versus mouse, independent of the conservation of gene expression. Finally, we present PASSpedia, a comprehensive database for PAS analysis and comparison across seven species at single-cell resolution, which is freely accessible online at https://bits.fudan.edu.cn/PASSpedia/.
{"title":"PASSpedia: A Polyadenylation Site Database Across Different Species at Single Cell Resolution.","authors":"Pei-Hong Zhang, Hua Feng, Xu-Kai Ma, Fang Nan, Li Yang","doi":"10.1093/gpbjnl/qzaf089","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf089","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-23","publicationTypes":"Journal Article"}
Zhenyu Huang, Xuechen Mu, Qiufen Chen, Lingli Zhong, Jun Xiao, Chunman Zuo, Ye Zhang, Bocheng Shi, Yingwei Qu, Renbo Tan, Long Xu, Renchu Guan, Ying Xu
Intracellular alkalosis and extracellular acidosis are well-established characteristics of Alzheimer's disease (AD). We present a computational analysis and modeling of transcriptomic data from AD tissues, aiming to understand the causes and consequences of these pH imbalances. Our analyses revealed that (1) persistent mitochondrial alkalization results from chronic inflammation coupled with elevated iron and copper metabolism; (2) the affected cells activate multiple acid-producing metabolic processes to keep the mitochondrial pH stable for survival; (3) the most significant of these is the continuous import and hydrolysis of glutamine to glutamate, NH3, and H+, resulting in persistent release of glutamate, an excitatory neurotransmitter, into the extracellular space; (4) this leads to persistent hyperexcitability of nearby neurons, resulting in their continuous firing and release of H+-rich synaptic vesicles; (5) in normal tissues, these H+ ions are neutralized by bicarbonate released by neighboring astrocytes, but in AD tissues the astrocytes' bicarbonate discharge rate cannot keep up with the increased H+ release, leading to progressively increasing extracellular acidosis and ultimately cell death; and (6) multiple extensively studied AD-associated phenotypes, including Aβ aggregates and Tau fibers, are induced to help alleviate the pH imbalances and are beneficial to cell survival in the early phase of AD, but gradually become contributors to AD development. Each step in this model is largely supported by published studies. Overall, we have developed a fundamentally novel, systems-level view of how AD may develop.
{"title":"A Model for the Development of Alzheimer's Disease.","authors":"Zhenyu Huang, Xuechen Mu, Qiufen Chen, Lingli Zhong, Jun Xiao, Chunman Zuo, Ye Zhang, Bocheng Shi, Yingwei Qu, Renbo Tan, Long Xu, Renchu Guan, Ying Xu","doi":"10.1093/gpbjnl/qzaf087","DOIUrl":"https://doi.org/10.1093/gpbjnl/qzaf087","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-23","publicationTypes":"Journal Article"}
Natália Aniceto, Nuno Martinho, Ismael Rufino, Rita C Guedes
The Protein Data Bank (PDB) is an ever-growing database of three-dimensional macromolecular structures that has become a crucial resource for the drug discovery process. Exploring complexed proteins and accessing their associated ligands are essential for researchers to understand biological processes and design new compounds of pharmaceutical interest. However, currently available tools for large-scale ligand identification fail to address many of the more complex ways in which ligands are stored and represented in PDB structures. Therefore, a new tool called LigExtract was specifically developed for the large-scale processing of PDB structures and the identification of their ligands. This is a fully open-source tool available to the scientific community, designed to provide end-to-end processing. Users simply provide a list of UniProt IDs, and LigExtract returns a list of ligands, their individual PDB files, a PDB file of the protein chains interacting with the ligand, and a series of log files. These logs record the decisions made during the ligand extraction process and flag additional scenarios that might have to be considered during any follow-up use of the processed files (e.g., ligands covalently bound to the protein). LigExtract is freely available on GitHub (https://github.com/comp-medchem/LigExtract).
{"title":"LigExtract: Large-scale Automated Identification of Ligands from Protein Structures in the Protein Data Bank.","authors":"Natália Aniceto, Nuno Martinho, Ismael Rufino, Rita C Guedes","doi":"10.1093/gpbjnl/qzaf018","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12619641/pdf/"}
Chengchao Wu, Tianshu Zhou, Wenfu Ke, Wei Xiong, Zhihui Zhang, Siheng Zhang, Jinyue Wang, Lulu Deng, Keji Yan, Man Wang, Shenglong He, Qi Gong, Chao Ma, Xiaping Chen, Yan Li, He Long, Chong Guo, Gang Cao, Zhijun Zhang
For chromosome abnormalities (CAs), such as Down syndrome (DS), the influence of genomic variations on chromosome conformation and gene transcription remains elusive. Based on the complete genomic sequences from the parents of a DS trisomy patient, we systematically delineated an atlas of parental-specific, haplotype-resolved single nucleotide polymorphisms (SNPs), copy number variations (CNVs), three-dimensional (3D) genome architecture, and RNA expression profiles in the diencephalon of the DS patient. The integrated haplotype-resolved multi-omics analysis demonstrated that one-dimensional (1D) genomic variations including SNPs and CNVs in the DS patient are highly correlated with the alterations in the 3D genome organization and the subsequent changes in gene transcription. This correlation remains valid at the haplotype level. Moreover, we revealed the 3D genome alteration-associated dysregulation of DS-related genes, which facilitates understanding the pathogenesis of CAs. Together, our study contributes to deciphering the coding from 1D genomic variations to 3D genome architecture and the subsequent gene transcription outcomes in both health and disease.
{"title":"Deciphering Haplotype-level Chromosome Conformation Alteration in Down Syndrome by Haplotype-resolved Multi-omics Analysis.","authors":"Chengchao Wu, Tianshu Zhou, Wenfu Ke, Wei Xiong, Zhihui Zhang, Siheng Zhang, Jinyue Wang, Lulu Deng, Keji Yan, Man Wang, Shenglong He, Qi Gong, Chao Ma, Xiaping Chen, Yan Li, He Long, Chong Guo, Gang Cao, Zhijun Zhang","doi":"10.1093/gpbjnl/qzaf054","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12571509/pdf/"}
Jiayue Meng, Mengyao Han, Yuwei Huang, Liang Li, Yuanhu Ju, Daqing Lv, Xiaoyi Chen, Liyun Yuan, Guoqing Zhang
Research on cell type markers helps investigators explore the diverse cellular composition of gastrointestinal tumors, thereby enhancing our understanding of tumor heterogeneity and its impact on disease progression and treatment response. However, the integration of large-scale datasets and the standardization of cell type identification remain challenging. Here, we developed PreDigs, a user-friendly database of predicted signatures for the digestive system, which offers 124 curated single-cell RNA sequencing datasets covering over 3.4 million cells, all available for download. After unsupervised clustering, we unified the identification and nomenclature of cell subtype labels, constructing a cell ontology tree with 142 cell types across 8 hierarchical levels. We also calculated three kinds of context-specific cell type markers, namely "Cell Markers", "Subtype Markers", and "TPN Markers", to meet different application requirements within or across tissues. Through integrated analysis of PreDigs data, we identified distinct cell subpopulations exclusive to tumors, one of which corresponds to tumor-specific endothelial cells. Additionally, PreDigs offers online cell annotation tools, allowing users to classify single cells with greater flexibility. PreDigs is accessible at https://www.biosino.org/predigs/.
{"title":"PreDigs: A Database of Context-specific Cell Type Markers and Precise Cell Subtypes for Digestive Cell Annotation.","authors":"Jiayue Meng, Mengyao Han, Yuwei Huang, Liang Li, Yuanhu Ju, Daqing Lv, Xiaoyi Chen, Liyun Yuan, Guoqing Zhang","doi":"10.1093/gpbjnl/qzaf066","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12571502/pdf/"}
Rujia Dai, Ming Zhang, Tianyao Chu, Richard Kopp, Chunling Zhang, Kefu Liu, Yue Wang, Xusheng Wang, Chao Chen, Chunyu Liu
Single-cell RNA sequencing (scRNA-seq) and single-nucleus RNA sequencing (snRNA-seq) have become essential tools for profiling gene expression across different cell types in biomedical research. While factors like RNA integrity, cell count, and sequencing depth are known to influence data quality, quantitative benchmarks and actionable guidelines are lacking. This gap contributes to variability in study designs and inconsistencies in downstream analyses. In this study, we systematically evaluated quantitative precision and accuracy in expression measures across 23 sc/snRNA-seq datasets comprising 3,682,576 cells from 339 samples. Precision was assessed using technical replicates based on pseudo-bulks created from subsampling. Accuracy was evaluated using sample-matched scRNA-seq and pooled-cell RNA sequencing data of mononuclear phagocytes from four species. Our results show that precision and accuracy are generally low at the single-cell level, with reproducibility being strongly influenced by cell count and RNA quality. We established data-driven thresholds for optimizing study design, recommending at least 500 cells per cell type per individual to achieve reliable quantification. Furthermore, we showed that signal-to-noise ratio is a key metric for identifying reproducible differentially expressed genes. To support future research, we developed Variability In single-Cell gene Expression (VICE), a tool that evaluates sc/snRNA-seq data quality and estimates the true positive rate of differential expression results based on sample size, observed noise levels, and expected effect size. These findings provide practical, evidence-based guidelines to enhance the reliability and reproducibility of sc/snRNA-seq studies.
{"title":"Precision and Accuracy in Quantitative Measurement of Gene Expression from Single-cell/nucleus RNA Sequencing Data.","authors":"Rujia Dai, Ming Zhang, Tianyao Chu, Richard Kopp, Chunling Zhang, Kefu Liu, Yue Wang, Xusheng Wang, Chao Chen, Chunyu Liu","doi":"10.1093/gpbjnl/qzaf077","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12603356/pdf/"}
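The precision assessment summarized in the abstract above (technical replicates built as pseudo-bulks from subsampled cells) can be illustrated with a minimal Python sketch. This is not the VICE implementation: the function names, the log-CPM normalization, the use of Pearson correlation as the reproducibility score, and the toy count matrix are all assumptions made for illustration only.

```python
import math
import random


def pseudobulk_pair(cells, n_cells, rng):
    """Sum counts per gene over two disjoint random subsets of n_cells cells.

    `cells` is a list of per-cell count vectors (one entry per gene).
    """
    if 2 * n_cells > len(cells):
        raise ValueError("need at least 2 * n_cells cells")
    picked = rng.sample(range(len(cells)), 2 * n_cells)
    halves = (picked[:n_cells], picked[n_cells:])
    n_genes = len(cells[0])
    return [[sum(cells[i][g] for i in half) for g in range(n_genes)]
            for half in halves]


def log_cpm(bulk):
    """log(1 + counts per million); assumes the pseudo-bulk has nonzero total."""
    total = sum(bulk)
    return [math.log1p(1e6 * x / total) for x in bulk]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def replicate_correlation(cells, n_cells, rng):
    """Correlation between two pseudo-bulk replicates of n_cells cells each."""
    rep_a, rep_b = pseudobulk_pair(cells, n_cells, rng)
    return pearson(log_cpm(rep_a), log_cpm(rep_b))


# Hypothetical toy data: 400 cells x 50 genes with noisy, low counts.
rng = random.Random(0)
gene_means = [0.5 + 0.1 * g for g in range(50)]
toy_cells = [[int(rng.expovariate(1.0 / m)) for m in gene_means]
             for _ in range(400)]

for n in (5, 50, 200):
    print(n, round(replicate_correlation(toy_cells, n, rng), 3))
```

Running the loop for increasing `n` gives a rough feel for how replicate agreement depends on the number of cells pooled, which is the kind of relationship the abstract's per-cell-type cell-count recommendation is based on.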
Qunhao Niu 牛群皓, Jiayuan Wu 武嘉远, Tianyi Wu 吴天弋, Tianliu Zhang 张天留, Tianzhen Wang 王添祯, Xu Zheng 郑旭, Zhida Zhao 赵志达, Ling Xu 徐玲, Zezhao Wang 王泽昭, Bo Zhu 朱波, Lupei Zhang 张路培, Huijiang Gao 高会江, George E Liu, Junya Li 李俊雅, Lingyang Xu 徐凌洋
Body weight is a polygenic trait with intricate inheritance patterns. Functional genomics enriched with multi-layer annotations offers essential resources for exploring the genetic architecture of complex traits. In this study, we conducted an extensive characterization of regulatory variants associated with body weight-related traits in cattle using multi-omics analysis. First, we identified seven candidate genes by integrating selective sweep analysis and multiple genome-wide association study (GWAS) strategies using imputed whole-genome sequencing data from a population of 1577 individuals. Subsequently, we uncovered 3340 eGenes (genes whose expression levels are associated with genetic variants) across 227 muscle samples. Transcriptome-wide association studies (TWASs) further revealed a total of 532 distinct candidate genes associated with body weight-related traits. Colocalization analyses unveiled 44 genes shared between expression quantitative trait loci (eQTLs) and GWAS signals. Moreover, a comprehensive analysis by integrating GWAS, selective sweep, eQTL, TWAS, epigenomic profiling, and molecular validation highlighted a positively selected genomic region on Bos taurus autosome 6 (BTA6). This locus harbors pleiotropic genes (LAP3, MED28, and NCAPG) and a prioritized functional variant involved in the complex regulation of body weight. Additionally, convergent evolution analysis and phenome-wide association studies underscored the conservation of this locus across species. Our study provides a comprehensive understanding of the genetic regulation of body weight through multi-omics analysis in cattle. Our findings contribute to unraveling the genetic mechanisms governing weight-related traits and shed valuable light on the genetic improvement of farm animals.
{"title":"Comprehensive Multi-omics Analysis of Regulatory Variants for Body Weight in Cattle.","authors":"Qunhao Niu 牛群皓, Jiayuan Wu 武嘉远, Tianyi Wu 吴天弋, Tianliu Zhang 张天留, Tianzhen Wang 王添祯, Xu Zheng 郑旭, Zhida Zhao 赵志达, Ling Xu 徐玲, Zezhao Wang 王泽昭, Bo Zhu 朱波, Lupei Zhang 张路培, Huijiang Gao 高会江, George E Liu, Junya Li 李俊雅, Lingyang Xu 徐凌洋","doi":"10.1093/gpbjnl/qzaf067","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12701805/pdf/"}
Yang Fang 方扬, Xueping Tian 田雪平, Yanling Jin 靳艳玲, Anping Du 杜安平, Yanqiang Ding 丁彦强, Zhihua Liao 廖志华, Kaize He 何开泽, Yonggui Zhao 赵永贵, Ling Guo 郭铃, Yao Xiao 肖瑶, Yaliang Xu 许亚良, Shuang Chen 陈爽, Yuqing Che 车育青, Li Tan 谭力, Songhu Wang 汪松虎, Jiatang Li 李家堂, Zhuolin Yi 易卓林, Lanchai Chen 陈兰钗, Leyi Zhao 赵乐伊, Fangyuan Zhang 张芳源, Guoyou Li 李国友, Jinmeng Li 李瑾萌, Qinli Xiong 熊勤犁, Yongmei Zhang 张咏梅, Qing Zhang 张庆, Xuan Hieu Cao, Hai Zhao 赵海
Terrestrialization is an important evolutionary process that plants experienced. However, little is known about how land plants acquired aquatic growth behaviors. Here, we integrate multiproxy evidence to elucidate the evolution of the aquatic plant duckweed. Three genera of duckweeds show chronologically gradual degeneration in root structure and stomatal function and a decrease in lignocellulose content, accompanied by the contraction of relevant gene families and/or a decline in their transcription levels. The number of genes in the main phytohormone pathways also gradually decreases. The coordinated action of genes involved in auxin signaling and rhizoid development causes a gradual decrease in adventitious roots. Additionally, the significant expansion of the flavonoid pathway is related to the adaptation of duckweeds to floating growth. This study reconstructs the evolutionary history of duckweeds, tracing their journey from land back to water, a reverse trajectory of early land plants.
{"title":"Duckweed Evolution: from Land back to Water.","authors":"Yang Fang 方扬, Xueping Tian 田雪平, Yanling Jin 靳艳玲, Anping Du 杜安平, Yanqiang Ding 丁彦强, Zhihua Liao 廖志华, Kaize He 何开泽, Yonggui Zhao 赵永贵, Ling Guo 郭铃, Yao Xiao 肖瑶, Yaliang Xu 许亚良, Shuang Chen 陈爽, Yuqing Che 车育青, Li Tan 谭力, Songhu Wang 汪松虎, Jiatang Li 李家堂, Zhuolin Yi 易卓林, Lanchai Chen 陈兰钗, Leyi Zhao 赵乐伊, Fangyuan Zhang 张芳源, Guoyou Li 李国友, Jinmeng Li 李瑾萌, Qinli Xiong 熊勤犁, Yongmei Zhang 张咏梅, Qing Zhang 张庆, Xuan Hieu Cao, Hai Zhao 赵海","doi":"10.1093/gpbjnl/qzaf074","journal":{"name":"Genomics, proteomics & bioinformatics"},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12707978/pdf/"}
Bitao Zhong, Rui Fan, Yue Ma, Xiangwen Ji, Qinghua Cui, Chunmei Cui
Advances in deep learning algorithms for medical image analysis have garnered significant attention in recent years. While several studies have shown promising results, with models matching or even surpassing human performance, translating these advances into clinical practice still faces various challenges. A primary obstacle is the limited availability of large-scale, well-characterized datasets for validating how well approaches generalize. To address this challenge, we curated a diverse collection of medical image datasets from multiple public sources, containing 105 datasets and a total of 1,995,671 images. These images span 14 modalities, including X-ray, computed tomography, magnetic resonance imaging, optical coherence tomography, ultrasound, and endoscopy, and originate from 13 organs, such as the lung, brain, eye, and heart. Subsequently, we constructed an online database, MedImg, which incorporates and systematically organizes these medical images to facilitate data accessibility. MedImg serves as an intuitive, open-access platform for research in deep learning-based medical image analysis, accessible at https://www.cuilab.cn/medimg/.
{"title":"MedImg: An Integrated Database for Public Medical Images.","authors":"Bitao Zhong, Rui Fan, Yue Ma, Xiangwen Ji, Qinghua Cui, Chunmei Cui","doi":"10.1093/gpbjnl/qzaf068","DOIUrl":"10.1093/gpbjnl/qzaf068","url":null,"abstract":"<p><p>The advancements in deep learning algorithms for medical image analysis have garnered significant attention in recent years. While several studies have shown promising results, with models achieving or even surpassing human performance, translating these advancements into clinical practice is still accompanied by various challenges. A primary obstacle lies in the availability of large-scale, well-characterized datasets for validating the generalization of approaches. To address this challenge, we curated a diverse collection of medical image datasets from multiple public sources, containing 105 datasets and a total of 1,995,671 images. These images span 14 modalities, including X-ray, computed tomography, magnetic resonance imaging, optical coherence tomography, ultrasound, and endoscopy, and originate from 13 organs, such as the lung, brain, eye, and heart. Subsequently, we constructed an online database, MedImg, which incorporates and systematically organizes these medical images to facilitate data accessibility. MedImg serves as an intuitive and open-access platform for facilitating research in deep learning-based medical image analysis, accessible at https://www.cuilab.cn/medimg/.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12558383/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144984001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiuju Chen 陈秀菊, Yanyu Sui 隋彦禹, Jiayi Gu 顾佳怡, Liang Wang 王亮, Ningxia Sun 孙宁霞
Rising infertility rates have prompted research into the impact of the vaginal microbiota on female fertility and the success of assisted reproductive technology (ART). Our study aimed to compare the vaginal microbiome of fertile and infertile women and explore its influence on ART outcomes. Vaginal secretion samples were collected from 194 infertile women and 100 healthy controls at Shanghai Changzheng Hospital. The V3-V4 region of the 16S rRNA gene was amplified using polymerase chain reaction (PCR). A machine learning model was applied to predict infertility based on genus-level abundance, and the PICRUSt algorithm was used to predict metabolic pathways related to infertility and ART outcomes. Infertile women exhibited a significantly different vaginal microbial composition from that of healthy controls, along with increased microbial diversity. Notably, the abundance of Burkholderia, Pseudomonas, and Prevotella was significantly elevated in the vaginal microbiota of the infertility group, while that of Bifidobacterium and Lactobacillus was reduced. Among infertile women, those with recurrent implantation failure (RIF) showed even higher vaginal microbial diversity, with genera such as Mobiluncus, Peptoniphilus, Prevotella, and Varibaculum being more abundant. Eleven metabolic pathways were associated with both RIF and infertility, with Prevotella showing stronger correlations with these pathways. This study elucidates differences in the vaginal microbiome between healthy and infertile women, providing novel insights into how the vaginal microbiota may affect infertility and ART outcomes. Our findings underscore the importance of specific microbial taxa in women with RIF, suggesting potential avenues for targeted interventions to improve embryo transfer success rates.
{"title":"Implication of the Vaginal Microbiome in Female Infertility and Assisted Conception Outcomes.","authors":"Xiuju Chen 陈秀菊, Yanyu Sui 隋彦禹, Jiayi Gu 顾佳怡, Liang Wang 王亮, Ningxia Sun 孙宁霞","doi":"10.1093/gpbjnl/qzaf042","DOIUrl":"10.1093/gpbjnl/qzaf042","url":null,"abstract":"<p><p>The rise in infertility rates has prompted research into the impact of vaginal microbiota on female fertility and the success of assisted reproductive technology (ART). Our study aimed to compare the vaginal microbiome between fertile and infertile women and explore its influence on ART outcomes. Vaginal secretion samples were collected from 194 infertile women and 100 healthy controls at Shanghai Changzheng Hospital. The V3-V4 region of the 16S rRNA gene was amplified using polymerase chain reaction (PCR). A machine learning model was applied to predict infertility based on genus-level abundance, and the PICRUSt algorithm was employed to predict metabolic pathways related to infertility and ART outcomes. The results showed that infertile women exhibited a significantly different vaginal microbial composition compared to healthy controls, along with increased microbial diversity. Notably, the abundance of Burkholderia, Pseudomonas, and Prevotella was significantly elevated in the vaginal microbiota of the infertility group, while that of Bifidobacterium and Lactobacillus was reduced. Among infertile women, those with recurrent implantation failure (RIF) showed even higher vaginal microbial diversity, with specific genera such as Mobiluncus, Peptoniphilus, Prevotella, and Varibaculum being more abundant. Eleven metabolic pathways were identified to be associated with both RIF and infertility, with Prevotella showing stronger correlations with these pathways. This study elucidates differences in vaginal microbiome between healthy and infertile women, providing novel insights into how vaginal microbiota may impact infertility and ART outcomes. Our findings underscore the importance of specific microbial taxa in women with RIF, suggesting potential avenues for targeted interventions to improve embryo transplantation success rates.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":7.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12603359/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144047396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}