
Briefings in Bioinformatics: Latest Articles

An effective fragment-based dual conditional diffusion framework for molecular generation.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf727
Haotian Chen, Yiting Shen, Jichun Li, Weizhong Zhao

Fragment-based molecular generation has emerged as a promising paradigm in structure-based drug design (SBDD), yielding compounds with desirable properties such as chemical validity, synthetic feasibility, and pharmacological relevance. However, existing approaches often struggle to generate molecules that both conform to 3D structural constraints and retain chemical plausibility. This is largely because prior work often treats scaffolds and R-groups indiscriminately, overlooking their distinct semantic roles. Specifically, the scaffold serves as the rigid structural backbone that determines the global geometric topology and binding pose, whereas R-groups act as functional substituents responsible for fine-tuning local physicochemical interactions. Therefore, in this work, we propose fragment-based dual conditional diffusion (FDC-Diff), a novel framework that integrates chemical priors and structural cues for fragment-based molecular generation. Unlike traditional de novo methods that generate atoms sequentially, FDC-Diff decomposes molecule generation into two semantically complementary stages. In the first stage, given the protein pocket and an initial fragment, a spatially constrained scaffold is constructed to capture the global molecular topology. In the second stage, R-groups are elaborated onto the obtained scaffold to capture local semantics and further refine molecular properties. To ensure synthetic accessibility, the initial fragments and the scaffold-modification hierarchy are derived from curated reaction rules, and a physical-chemistry-inspired refinement step is applied to optimize final conformations. Experimental results on multiple SBDD benchmarks demonstrate that FDC-Diff achieves state-of-the-art performance in comprehensive evaluations. Furthermore, our model excels at producing chemically valid, spatially compatible, and pharmacologically relevant molecules, suggesting its potential as a feasible tool for fragment-based drug design.

Citations: 0
Advances and challenges in single-cell RNA sequencing data analysis: a comprehensive review.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf723
Ali Mohammad Nesari, Habib MotieGhader, Saeid Ghorbian

Single-cell RNA sequencing (scRNA-seq) has transformed the resolution of cellular heterogeneity, offering insights into dynamic biological processes from tumor evolution to immune regulation. However, its clinical translation is limited by challenges such as data sparsity, batch effects (differences caused by technical variation rather than biology), and the absence of standardized benchmarks for core pipelines like Seurat and Scanpy. This review outlines emerging computational strategies that address these limitations: (A) robust preprocessing, including SCTransform for correcting zero-inflation (an excess of zero counts in gene-expression data) and Harmony for batch integration, the latter achieving 30% faster alignment than BBKNN in cohorts exceeding 100,000 cells; (B) transformer-based annotation tools such as scGPT and CellTypist, which reach >95% accuracy in immune profiling using models pretrained on 33 million cells; and (C) multimodal integration with spatial transcriptomics (e.g., 10x Visium, cell2location v2), which delineates microenvironmental niches and rare CX3CR1+ T-cell subsets in disease contexts like glioblastoma and severe COVID-19. We further assess how scANVI bridges scRNA-seq and ATAC-seq to uncover epigenetic mechanisms underlying therapy resistance, and how spatial methods elucidate tumor-immune crosstalk at subcellular resolution. Despite these advances, ethical risks remain, particularly around re-identification of rare patient-derived clones such as pre-metastatic cells. To promote clinical adoption, we propose a roadmap that prioritizes benchmarked workflows (e.g., scverse ecosystem), privacy-aware data sharing via federated learning, and causal AI approaches to disentangle biological signal from technical artifact. By synthesizing computational innovations with translational case studies, this review equips researchers to navigate both the analytical and ethical complexities of scRNA-seq in pursuit of actionable diagnostics.
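To make the notion of zero-inflation mentioned above concrete, the following minimal Python sketch compares the observed fraction of zero counts for one gene with the fraction a Poisson model with the same mean would predict. The Poisson baseline, the function name `zero_excess`, and the example counts are illustrative assumptions, not taken from the review, and this is not a description of how SCTransform works internally.

```python
import math

def zero_excess(counts):
    """Observed zero fraction minus the zero fraction expected
    under a Poisson model with the same mean (an assumed baseline)."""
    n = len(counts)
    if n == 0:
        raise ValueError("counts must not be empty")
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-mean)  # P(X = 0) for Poisson(mean)
    return observed - expected

# Hypothetical counts for one gene across ten cells (not real data).
gene_counts = [0, 0, 0, 0, 0, 0, 0, 12, 9, 9]
excess = zero_excess(gene_counts)
print(f"excess zero fraction: {excess:.3f}")
```

A clearly positive value suggests more zeros than the simple count model accounts for; real pipelines typically use more flexible models than this single-gene check.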

Citations: 0
Detection of alternative splicing: deep sequencing or deep learning?
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf705
Lena Maria Hackl, Fabian Neuhaus, Sabine Ameling, Uwe Völker, Jan Baumbach, Olga Tsoy

Alternative splicing is a crucial mechanism of gene regulation that enables condition- and tissue-specific expression of gene isoforms. Its dysregulation plays a role in various diseases such as cancer, neurological disorders, and metabolic conditions. Despite its importance, accurate detection of alternative splicing events remains challenging. Comprehensive alternative splicing event detection typically requires deep sequencing with over 100 million reads; however, much of the publicly accessible RNA sequencing data is of lower sequencing depth. Recent advances, particularly deep learning models working with genomic sequences, offer new avenues for predicting alternative splicing without reliance on high sequencing depth data. Our study addresses the question: Can we utilize the vast repository of publicly available RNA sequencing data for comprehensive alternative splicing detection, despite the low sequencing depth? Our results demonstrate the potential of sequence-based deep learning tools such as AlphaGenome, SpliceAI and DeepSplice for initial hypothesis development and as additional filters in standard RNA sequencing pipelines, especially when sequencing depth is limited. Nonetheless, validation with higher sequencing depths remains essential for confirmation of splice events. Overall, our findings underscore the need for integrative methods combining genomic sequence data and RNA sequencing data for the prediction of tissue- and condition-specific alternative splicing in resource-limited settings.

Citations: 0
Systematic evaluation of computational tools to predict the effects of mutations on protein-ligand binding affinity in the absence of experimental structures.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbag035
Qisheng Pan, Stephanie Portelli, Thanh Binh Nguyen, David B Ascher

Drug resistance caused by mutations is a significant global health concern. One way to better understand this phenomenon is by studying changes in protein-ligand binding affinity upon mutation. While recent advances in protein modelling, such as AlphaFold2 and AlphaFold3, have transformed structural assessments, their utility in predicting mutation-induced binding affinity changes remains underexplored. We evaluated various mutation-based methods and scoring functions using computer-generated protein-ligand complexes. Compared to a baseline using experimental structures, we observed a performance drop ranging from 5% to 30% across different computational models. Specifically, using experimental receptors with docked ligands resulted in a ~5% drop, similar to that observed with AlphaFold3 models (~5%), despite the latter offering lower ligand root mean square deviation. However, using AlphaFold2 receptors with docking led to a greater performance loss (10%-20%), comparable to homology models with high sequence identity. Homology models based on low-identity templates showed over 30% decline. These performance differences were most pronounced for interface mutations and low molecular weight ligands. While AlphaFold models offer accurate protein and interaction predictions, they lack mutation-specific information, such as dynamic changes, highlighting the need for complementary mutation-aware methods for reliable analysis. Our findings provide insights into interpreting mutation effects on ligand binding using predicted structures and can guide more robust assessments of drug resistance mechanisms in silico.
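The ligand root mean square deviation mentioned above can be illustrated with a short Python sketch. It assumes both poses list the same atoms in the same order and are already expressed in a common receptor frame, so no superposition is performed; the function name and the coordinates are hypothetical and are not drawn from the study.

```python
import math

def ligand_rmsd(reference, predicted):
    """Root mean square deviation between two ligand poses.

    Assumes identical atom ordering and a shared coordinate frame
    (no alignment step), which is an assumption of this sketch.
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (xr - xp) ** 2 + (yr - yp) ** 2 + (zr - zp) ** 2
        for (xr, yr, zr), (xp, yp, zp) in zip(reference, predicted)
    )
    return math.sqrt(squared / len(reference))

# Hypothetical three-atom ligand: the predicted pose is shifted by 1 unit along z.
reference_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
predicted_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
rmsd = ligand_rmsd(reference_pose, predicted_pose)
print(f"ligand RMSD: {rmsd:.2f}")
```

Symmetry-aware atom matching, which dedicated tools usually apply for ligands with equivalent atoms, is deliberately left out here.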

Citations: 0
ANIA: an inception-attention network for predicting minimum inhibitory concentration of antimicrobial peptides.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbag023
Yen-Peng Chiu, Lantian Yao, Yun Tang, Chia-Ru Chung, Yuxuan Pang, Ying-Chih Chiang, Tzong-Yi Lee

Antimicrobial resistance poses a significant challenge to conventional antibiotics, underscoring the urgent need for alternative therapeutic strategies. Antimicrobial peptides (AMPs) have emerged as promising candidates due to their broad-spectrum antibacterial activity and distinct mechanisms of action. This study presents ANIA, a deep learning framework developed to predict the minimum inhibitory concentration (MIC) values of AMPs against three clinically significant bacteria: Staphylococcus aureus, Escherichia coli, and Pseudomonas aeruginosa. ANIA leverages Chaos Game Representation (CGR) to transform AMP sequences into frequency-based image features, which are subsequently processed through a hybrid architecture comprising stacked Inception modules, a Transformer encoder, and a regression head. This integrative architecture enables ANIA to capture both local motif-based features and global contextual patterns embedded within AMP sequences. In benchmarking experiments, ANIA achieved notably superior performance compared to existing tools, including ESKAPEE-Pred, AMPActiPred, and esAMPMIC, achieving higher correlation coefficients and lower predictive errors across all bacteria targets, with the most pronounced improvement observed for P. aeruginosa, a pathogen renowned for its multidrug resistance. Specifically, ANIA achieved PCCs of 0.75-0.79 and MSEs of 0.23-0.26 across all species. Furthermore, motif-based interpretability analyses combining Grad-CAM visualizations, correlation heatmaps, motif frequency distributions, and hydrophobicity profiling revealed biologically meaningful subregions within the CGR matrix that are plausibly associated with antimicrobial efficacy. In conclusion, this study develops ANIA as a robust predictive tool for MIC estimation, offering valuable insights into the design of effective antimicrobial agents and contributing to the fight against antimicrobial resistance. 
A user-friendly web server for ANIA is available at https://biomics.lab.nycu.edu.tw/ANIA/.
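For readers unfamiliar with the two reported metrics, the Python sketch below computes a Pearson correlation coefficient (PCC) and a mean squared error (MSE) between observed and predicted values. The numbers are hypothetical, and the abstract does not state the scale on which MIC values were compared, so no attempt is made to reproduce the reported ranges.

```python
import math

def pearson(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def mse(observed, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical MIC-like values (illustrative only).
observed_mic = [1.0, 2.0, 3.0, 4.0]
predicted_mic = [1.5, 1.5, 3.5, 3.5]
pcc = pearson(observed_mic, predicted_mic)
err = mse(observed_mic, predicted_mic)
print(f"PCC = {pcc:.3f}, MSE = {err:.3f}")
```

A higher PCC indicates that predictions rank and scale with the observations, while a lower MSE indicates smaller absolute deviations; the two can disagree, which is why both are usually reported.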

Citations: 0
Divergent Eurasian ancestry and local adaptation shape the genetic landscapes of the Yugur and Uyghur.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbag041
Siyong Yu, Jia Wen, Yang Gao, Zhaoqing Yang, Xu Wang, Yan Lu, Jiayou Chu, Dilinuer Maimaitiyiming, Shuhua Xu

The Yugur and Uyghur people of northwestern China share documented Early Medieval origins, yet the evolutionary processes that shaped their present-day genomes remain unresolved. Here, we generate high-coverage whole-genome sequences for the Yugurs and compare them with Uyghur genomes to reconstruct their demographic histories, ancestry profiles, and adaptive trajectories. Both groups derive from mixtures of East Eurasian ancestry (EEA) and West Eurasian ancestry (WEA) but in sharply contrasting proportions: the Yugur retain predominantly EEA (~90%), whereas the Uyghur harbor a near-equal balance. Modeling reveals distinct episodes of admixture in Gansu and Xinjiang, with identity-by-descent patterns indicating persistent but substantially reduced genetic continuity (FST = 0.021). Strikingly, despite their EEA-rich background, the Yugur show WEA-shifted allele frequencies at craniofacial loci, including EDAR and LIMS1, suggesting subtle trait convergence. Signals of recent positive selection further differentiate the two populations: the Yugur display strong selection on the FADS locus linked to lipid metabolism, whereas both groups exhibit selection at PPARA but with greater intensity in the Uyghur, consistent with their higher WEA. Functional enrichment analyses highlight overlapping immune and metabolic pathways, consistent with shared biological patterns shaped by demographic history and long-term residence in Northwestern China. Together, these findings show how divergent admixture proportions and region-specific natural selection have produced distinct genomic architectures in two historically related populations along the Silk Road.
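As a rough illustration of what an FST value measures, the Python sketch below computes Wright's fixation index for a single biallelic locus from the allele frequencies of two equally weighted populations. The abstract does not say which estimator the authors used, so this classical heterozygosity-based form, the function name, and the frequencies are all assumptions made for illustration.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic locus and two equally weighted populations.

    FST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S is the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    return (h_total - h_within) / h_total

# Hypothetical allele frequencies in two populations (not from the study).
fst = wright_fst(0.5, 0.3)
print(f"FST = {fst:.4f}")
```

Genome-wide estimates are normally obtained by combining many loci and correcting for sample size, which this single-locus sketch does not attempt.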

Citations: 0
A novel two-sample Mendelian randomization framework integrating common and rare variants: application to assess the effect of HDL-C on preeclampsia risk.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf649
Yu Zhang, Ming Li, David M Haas, C Noel Bairey Merz, Tsegaselassie Workalemahu, Kelli Ryckman, Janet M Catov, Lisa D Levine, Alexa Freedman, George R Saade, Jiaqi Hu, Hongyu Zhao, Xihao Li, Nianjun Liu, Qi Yan

Mendelian randomization (MR) has become an important technique for establishing causal relationships between risk factors and health outcomes. By using genetic variants as instrumental variables, it can mitigate bias due to confounding and reverse causation in observational studies. Current MR analyses have predominantly used common genetic variants as instruments, which represent only part of the genetic architecture of complex traits. Rare variants, which can have larger effect sizes and provide unique biological insights, have been understudied due to statistical and methodological challenges. We introduce MR-common and annotation-informed rare variants (MR-CARV), a novel framework integrating common and rare genetic variants in two-sample MR. This method leverages comprehensive genetic data made available by high-throughput sequencing technologies and large-scale consortia. Rare variants are aggregated into functional categories, such as gene-coding, gene-noncoding, and nongene regions, by leveraging variant annotations and biological impact as weights. The effects of rare variant sets are then estimated with STAARpipeline and combined with the estimated effects of common variants by the existing MR methods. Simulation studies demonstrate that MR-CARV maintains robust type I error and achieves higher statistical power, with up to a 66.3% relative increase compared with existing methods only based on common variants. Consistent with these findings, application to real data on high-density lipoprotein cholesterol (HDL-C) and preeclampsia showed that MR-CARV [inverse variance weighted (IVW)] yielded a more precise and statistically significant effect estimate (-0.020, SE = 0.0102, P = .0470) than IVW using only common variants (-0.023, SE = 0.0123, P = .0659).
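For readers unfamiliar with the IVW estimator named in the abstract, the sketch below shows the standard fixed-effect inverse variance weighted calculation from per-variant summary statistics, plus a normal-approximation two-sided P value. It covers only the conventional common-variant step; the rare-variant aggregation and the way MR-CARV combines the two are not reproduced here. Function names and the toy inputs are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each variant contributes a ratio estimate beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2.
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


def two_sided_p(estimate, se):
    """Two-sided P value for the Wald statistic estimate / se under a normal approximation."""
    return math.erfc(abs(estimate / se) / math.sqrt(2.0))


if __name__ == "__main__":
    # Toy summary statistics (hypothetical, not from the study)
    est, se = ivw_estimate([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.02, 0.01])
    print(est, se, two_sided_p(est, se))
```

Variants with a larger effect on the exposure and a more precise outcome association receive more weight, which is why adding well-estimated instruments tends to shrink the standard error.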

Citations: 0
Cross-ancestry information transfer framework improves protein abundance prediction and protein-trait association identification.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf707
Wenli Zhai, Lingyun Sun, Wenwei Fang, Yidan Dong, Chunxiao Cheng, Yuanjiao Liu, Yuan Zhou, Jiadong Ji, Lang Wu, An Pan, Eric R Gamazon, Xiong-Fei Pan, Dan Zhou

Genetics-informed proteome-wide association studies (PWASs) provide an effective way to uncover proteomic mechanisms underlying complex diseases. PWAS relies on an ancestry-matched reference panel to model the impact of genetically determined protein expression on phenotype. However, reference panels from underrepresented populations remain relatively limited. We developed a multi-ancestry framework to enhance protein prediction in these populations by integrating diverse information-sharing strategies into a Multi-Ancestry Best-performing Model (MABM). MABM improved prediction performance in both cross-validation and an external dataset. Leveraging Biobank Japan, we identified three times as many significant PWAS associations using MABM as using a Lasso model. Notably, 47.5% of the MABM-specific associations were reproduced in independent East Asian datasets with concordant effect sizes. Furthermore, MABM enhanced decision-making in gene/protein prioritization for functional validation of complex traits by validating well-established associations and uncovering novel trait-related candidates. The benefits of MABM were further validated in additional ancestries and demonstrated in brain tissue-based PWAS, underscoring its broad applicability. Our findings close critical gaps in multi-omics research and facilitate trait-relevant protein discovery in underrepresented populations.
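The basic PWAS idea described above, predicting protein abundance from genotype and then relating the prediction to a trait, can be sketched as follows. This is a generic illustration under simple assumptions: per-variant weights from some reference-panel model, allele dosages coded 0/1/2, and a plain Pearson correlation as the association measure. The variant IDs, weights, and function names are hypothetical, and none of MABM's information-sharing strategies are reproduced.

```python
import math


def predict_abundance(dosages, weights):
    """Genetically predicted protein level: weighted sum of allele dosages.

    `weights` maps variant ID to effect weight; variants missing from
    `dosages` are treated as dosage 0.
    """
    return sum(w * dosages.get(snp, 0.0) for snp, w in weights.items())


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical weights and genotypes, for illustration only
    weights = {"rs_a": 0.5, "rs_b": -0.2}
    cohort = [{"rs_a": 0, "rs_b": 1}, {"rs_a": 1, "rs_b": 0}, {"rs_a": 2, "rs_b": 1}]
    trait = [1.0, 1.8, 2.5]
    predicted = [predict_abundance(person, weights) for person in cohort]
    print(pearson_r(predicted, trait))
```

Because the weights come from a reference panel, a panel that poorly matches the target ancestry yields poor predictions, which is the gap the abstract says MABM is meant to address.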

Citations: 0
iceDP: identifying inter-chromatin engagement via density peaks clustering algorithm.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf704
Ruhai Chen, Jiekai Chen, Lingling Shi, Jiangping He

Chromatin topological structure is critical for gene regulation. Hi-C-based experiments have significantly advanced our understanding of chromatin organization. Numerous computational tools have been developed to identify various structural levels of chromatin, ranging from compartments to loops. However, there remains a lack of specialized tools for identifying non-homologous inter-chromatin contacts (NHCCs), which play important roles in chromosome territories. In this study, we present iceDP, a tool that leverages the Density Peaks clustering algorithm to identify local high-density regions within inter-chromatin. These regions undergo two subsequent filtering steps to eliminate obvious false positives. When applied to three Hi-C datasets, iceDP accurately identified known NHCCs, including olfactory receptor genes in mature olfactory sensory neurons and Polycomb repressive complex-regulated developmental genes in mouse embryonic stem cells (mESCs). Notably, iceDP also uncovered previously unreported transcriptionally active NHCCs. Compared to diffHiC and FitHiC, iceDP exhibited superior performance with the highest positive rate. Moreover, iceDP is compatible with a wide range of chromatin conformation capture techniques, including in-situ Hi-C, Micro-C, HiChIP, and BL-HiC, demonstrating its versatility and utility.
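The density peaks idea the abstract refers to can be sketched generically: each point gets a local density (rho, the number of neighbours within a cutoff) and a separation (delta, the distance to the nearest point of higher density), and points where both are large are taken as cluster centres. The code below is a minimal illustration on 2D points, not iceDP's implementation; how iceDP turns inter-chromosomal contacts into points, picks the cutoff, and applies its two filtering steps is not described in the abstract. The function name, cutoff, and toy coordinates are hypothetical.

```python
import math


def density_peaks(points, cutoff, n_centers):
    """Return indices of the n_centers points with the largest rho * delta."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points strictly closer than the cutoff
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff) for i in range(n)]
    # Visit points from densest to sparsest; ties in rho are broken by index
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: by convention, use its distance to the farthest point
            delta[i] = max(dist[i])
        else:
            # Distance to the nearest point that ranks as denser
            delta[i] = min(dist[i][j] for j in order[:rank])
    return sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)[:n_centers]


if __name__ == "__main__":
    # Two toy clusters, each a unit square with a point at its centre
    cluster_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
    cluster_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    print(density_peaks(cluster_a + cluster_b, cutoff=1.0, n_centers=2))
```

Ranking by the product rho * delta is one common heuristic for picking centres; in practice the two quantities are often inspected together on a decision plot before choosing a threshold.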

Citations: 0
CircRM: profiling circular RNA modifications from nanopore direct RNA sequencing.
IF 7.7 Zone 2 (Biology) Q1 BIOCHEMICAL RESEARCH METHODS Pub Date: 2026-01-07 DOI: 10.1093/bib/bbaf726
Jiayi Li, Shenglun Chen, Zhixing Wu, Haozhe Wang, Rong Xia, Jia Meng, Yuxin Zhang

Circular RNA (circRNA) represents a critical class of regulatory RNAs with distinctive structural and functional features. The functions of circRNAs are modulated by various RNA modifications. Here, we present CircRM, a nanopore direct RNA sequencing-based computational method for profiling RNA modifications in circRNAs at single-base and single-molecule resolution. By integrating circRNA detection, read-level modification detection, and quantitative assessment of methylation rates, CircRM identified 427 high-confidence circRNAs and enabled systematic characterization of three major modifications: m5C (AUC = 0.855), m6A (AUC = 0.817), and m1A (AUC = 0.769). It revealed distinct modification patterns compared with linear RNAs, highlighting RNA-type-specific regulation. We also identified the key features of circRNA-specific modifications, such as enrichment near the back-splice junctions. Cross-cell line analyses further demonstrated conserved and cell-type-specific modification patterns. Together, these findings reveal, at the computational level, a unique epitranscriptomic landscape associated with circRNAs and establish CircRM as a powerful tool for advancing the study of RNA modifications in circular RNA biology. CircRM is freely accessible at: https://github.com/jiayiAnnie17/CircRM.
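The AUC values quoted for each modification type summarize how well a classifier's scores separate modified from unmodified examples. As a reminder of what the number means, the sketch below computes AUC directly from its rank interpretation: the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half. The function name and the toy scores are hypothetical and unrelated to CircRM's data or code.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Hypothetical read-level modification scores, for illustration only
    modified = [0.9, 0.3]
    unmodified = [0.5, 0.1]
    print(auc(modified, unmodified))
```

An AUC of 0.5 corresponds to scores that carry no ranking information, and 1.0 to perfect separation; the pairwise loop is quadratic, so large evaluations would normally use a rank-based formula instead.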

Citations: 0