BMC Bioinformatics: latest articles (page 7)

X-cross/over: a web tool for graph-based estimation of meiotic crossover events in plants.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-08 DOI: 10.1186/s12859-025-06334-7

Szabolcs Makai, Diána Makai, Erika Chonata-Jiménez, Ildikó Karsai, Péter Mikó, Adél Sepsi, András Cseh

Background: Crossovers are essential for genome stability and genetic diversity, yet in plants they occur infrequently, typically restricted to only one to three per chromosome pair. Genotyping approaches such as SNP arrays or genotyping-by-sequencing (GBS) enable high-resolution detection of crossover frequency, a critical step for elucidating the mechanisms that regulate meiotic recombination and for exploiting it in plant breeding. Despite their widespread use and the availability of highly reproducible marker sets, user-friendly tools for reliable recombination analysis remain scarce.

Results: Here we present X-cross/over, a web-based platform that applies a graph-theoretical algorithm to estimate crossover frequencies from SNP datasets in HapMap format. The platform was evaluated using publicly available barley backcross inbred populations and newly developed wheat doubled haploid lines. Across both datasets, X-cross/over detected crossover events with high accuracy and sensitivity, yielding results consistent with published genotyping and cytological analyses. Importantly, the tool produces outcomes comparable to expert analyses while remaining accessible to users without bioinformatics expertise.

Conclusions: X-cross/over provides a consistent and transparent framework for detecting crossover sites and quantifying their frequency. Implemented in a platform-independent environment, the application is freely available at https://insilicolabdesk.atk.kinin.hu , making it a versatile resource for exploring the genetic and epigenetic regulation of meiotic recombination across plant species.
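
As a rough illustration of what counting crossovers from SNP genotypes involves, the sketch below counts parental-allele switches along one chromosome of a single line. It is not the graph-theoretical algorithm used by X-cross/over, which the abstract does not describe. The function name `count_switches`, the "A"/"B" parental coding, and the `min_run` error filter are all assumptions made for this example.

```python
from itertools import groupby


def count_switches(calls, min_run=2):
    """Count parental-allele switches along one chromosome.

    calls: genotype calls ordered by physical position, coded "A" or "B"
        for the two parents (hypothetical coding); anything else, such as
        "N", is treated as missing and skipped.
    min_run: runs shorter than this are discarded as likely genotyping
        errors rather than counted as double crossovers (assumed heuristic).
    """
    informative = [c for c in calls if c in ("A", "B")]
    # Collapse consecutive identical calls into (allele, run_length) pairs.
    runs = [(allele, len(list(group))) for allele, group in groupby(informative)]
    # Drop short runs, then merge neighbours that now carry the same allele.
    kept = [allele for allele, length in runs if length >= min_run]
    merged = [allele for allele, _ in groupby(kept)]
    # Each boundary between merged blocks is one inferred switch.
    return max(len(merged) - 1, 0)
```

For example, `count_switches(list("AAAABBBBAAAA"))` returns 2, while the single stray call in `"AAAABAAAA"` is filtered out and yields 0. A real analysis would also need marker positions and population-specific expectations, which this sketch ignores.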

Citations: 0

MARTS-DB: a database of mechanisms and reactions of terpene synthases.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-08 DOI: 10.1186/s12859-025-06341-8

Martin Engst, Martin Brokeš, Tereza Čalounová, Raman Samusevich, Roman Bushuiev, Anton Bushuiev, Ratthachat Chatpatanasiri, Adéla Tajovská, Safa Mert Akmeşe, Milana Perković, Matouš Soldát, Josef Sivic, Tomáš Pluskal

Background: Terpene synthases (TPSs) are enzymes that catalyze some of the most complex reactions in nature: the cyclizations of terpenes, which form the carbon backbones of the largest group of natural products, the terpenoids. On average, more than half of the carbon atoms in a terpene scaffold undergo a change in connectivity or configuration during these enzymatic cascades. Understanding TPS reaction mechanisms remains challenging, often requiring intricate computational modeling and isotopic labelling studies. Moreover, the relationship between TPS sequence and catalytic function is difficult to decipher, while data-driven approaches remain limited by a lack of comprehensive, high-quality data sources.

Main: We introduce the Mechanisms And Reactions of Terpene Synthases DataBase (MARTS-DB), a manually curated, structured, and searchable database that integrates TPS enzymes, the terpenes they produce, and their detailed reaction mechanisms. MARTS-DB includes over 2850 reactions catalyzed by 1432 annotated enzymes from across all domains of life, with reaction mechanisms mapped as stepwise cascades for more than 500 terpenes. Accessible at https://www.marts-db.org , the database provides advanced search functionality and supports full dataset downloads in machine-readable formats. It also encourages community contributions to promote continuous growth.

Conclusion: User-friendly and comprehensive, MARTS-DB enables the systematic exploration of TPS catalysis, opening new avenues for computational analysis and machine learning, as recently demonstrated in the prediction of novel TPSs.

Citations: 0

SVhet: towards accurate detection of germline heterozygous deletions using short reads.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-07 DOI: 10.1186/s12859-025-06342-7

Chun Hing She, Sophelia Hoi-Shan Chan, Wanling Yang

Background: Accurate structural variant (SV) detection from short-read sequencing data remains challenged by false positives, particularly for heterozygous deletions, where allelic support is reduced and coverage-based detection is ambiguous. Existing SV genotyping and filtering approaches suffer from significant recall reductions, depend on additional pre-computed resources, or are restricted to depth-based signals that overlook read-level evidence.

Results: Here we present SVhet, a novel computational framework that leverages the heterozygosity patterns detected from different types of read evidence to identify false heterozygous deletions. Comprehensive benchmarking using 31 Human Genome Structural Variation Consortium Phase 3 samples demonstrated SVhet's ability to further reduce false positives while maintaining baseline recall. A hybrid approach combining duphold and SVhet achieved up to a 60% reduction in false positive counts while preserving recall. We also showed that SVhet is computationally efficient: it can process a whole-genome structural variant callset in under 5 min using 4 CPU cores. SVhet is available under a permissive MIT license via https://github.com/snakesch/SVhet .

Conclusion: SVhet provides an accurate and efficient solution for evaluating heterozygous deletions derived from short-read sequencing data. SVhet can be used as a standalone tool or in conjunction with other filtering tools such as duphold. Importantly, it does not require additional variant sets and can operate with minimal compute. Altogether, SVhet adds to the current effort to achieve accurate structural variant detection using short reads.
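
The intuition behind using heterozygosity to vet a deletion call can be sketched as follows: inside a true heterozygous deletion only one haplotype remains, so small variants in that interval should not appear heterozygous. The snippet below is only an illustration of that idea under stated assumptions, not SVhet's actual method or interface. The function name, the thresholds `max_het_fraction` and `min_snps`, and the VCF-style genotype strings are all hypothetical.

```python
def looks_like_false_het_deletion(snp_genotypes, max_het_fraction=0.1, min_snps=3):
    """Flag a heterozygous deletion call as suspicious.

    snp_genotypes: VCF-style genotype strings (e.g. "0/1", "1|1") for SNPs
        located inside the called deletion interval.
    Returns True if too many SNPs look heterozygous, False if the pattern is
    compatible with a single remaining haplotype, and None if there are too
    few called SNPs to decide. Thresholds are illustrative assumptions.
    """
    # Normalise phased separators and skip missing calls such as "./.".
    called = [g.replace("|", "/") for g in snp_genotypes if "." not in g]
    if len(called) < min_snps:
        return None
    # A genotype is heterozygous when its alleles are not all identical.
    het = sum(1 for g in called if len(set(g.split("/"))) > 1)
    return het / len(called) > max_het_fraction
```

A production filter would combine this kind of signal with other read-level evidence; the sketch shows only the single check on genotype strings.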

Citations: 0

GeDi: simplifying gene set distances for enhanced omics interpretation in R/Bioconductor.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-07 DOI: 10.1186/s12859-025-06335-6

Annekathrin Silvia Nedwed, Arsenij Ustjanzew, Najla Abassi, Leon Dammer, Alicia Schulze, Sara Salome Helbich, Michael Delacher, Konstantin Strauch, Federico Marini

Citations: 0

A DSSM network for inferring and prioritizing cell-type-specific regulons using single-cell RNA-seq data.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-07 DOI: 10.1186/s12859-025-06329-4

Yaxin Fan, Yichao Mei, Shengbao Bao, Jianyong Wang, Junxiang Gao

Background: Transcription factors and their target genes form regulatory modules known as regulons, which exhibit significant specificity across various cell types. The integration of single-cell transcriptome data, transcription factor motif data, and ChIP-seq data presents a challenging task in identifying cell-type-specific regulons and examining their activities.

Results: In response, this study presents a Deep Structured Semantic Model for inferring and prioritizing cell-type-specific Regulons (DSSMReg). This approach utilizes single-cell transcriptome and transcription factor motif data to map transcription factors and target genes into a low-dimensional semantic space, resulting in the generation of feature vectors. The model then computes the cosine similarity between transcription factors and target genes to evaluate their regulatory strength and subsequently infers cell-type-specific regulons based on this assessment. Moreover, DSSMReg employs the AUCell algorithm to rank the importance of regulons for each cell type.

Conclusions: We compared DSSMReg against five representative gene regulatory inference algorithms using scRNA-seq data from five cell lines, with DSSMReg achieving the highest evaluation metrics for both AUROC and AUPRC. Furthermore, we applied DSSMReg to infer cell-type-specific regulons from scRNA-seq data of triple-negative breast cancer and human bone marrow hematopoietic stem cells. Our results indicated that regulons with high AUCell scores possess significant biological relevance. The source code of DSSMReg is freely available at https://github.com/YaxinF/DSSMReg .
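
The scoring step described in the Results, cosine similarity between transcription factor and target gene feature vectors, can be sketched in a few lines. This is a minimal illustration and not DSSMReg's code: the embedding vectors here are made up, whereas in DSSMReg they would come from the trained model, and the helper names `cosine_similarity` and `rank_targets` are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Convention chosen here for all-zero vectors.
    return dot / (norm_u * norm_v)


def rank_targets(tf_vector, gene_vectors, top_k=2):
    """Rank candidate target genes by similarity to one transcription factor.

    gene_vectors: dict mapping gene name to its embedding (hypothetical data).
    Returns the top_k (gene, score) pairs, highest score first.
    """
    scores = {gene: cosine_similarity(tf_vector, vec)
              for gene, vec in gene_vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]
```

With a toy transcription factor vector `[1.0, 0.0]` and genes `g1 = [1.0, 0.0]`, `g2 = [0.0, 1.0]`, `g3 = [1.0, 1.0]`, the ranking places `g1` first and `g3` second. How scores are thresholded into regulons is not specified in the abstract and is left out here.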

Citations: 0

easyClock: a user-friendly desktop application for circadian rhythm analysis and visualization.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-05 DOI: 10.1186/s12859-025-06340-9

Binbin Wu, William W Ja

Circadian rhythms regulate a wide range of biological processes, and their precise characterization is essential for understanding behavioral and physiological fluctuations. However, existing tools to analyze circadian data often require coding expertise or rely on specific data acquisition software, limiting their general applicability. Here, we present easyClock, an intuitive and interactive application designed to streamline circadian rhythm analysis and visualization. The easyClock application enables simultaneous processing of multiple files, allowing users to batch-analyze and visualize diverse sets of time series data. To enhance data analysis efficiency and provide comparable results, the application integrates comprehensive methods for handling data with various waveforms and noise levels. Additionally, easyClock can assess inter-individual variability and group differences using linear mixed-effects modeling. All statistical results and graphs can be viewed and exported for any selected range of data. As a demonstration, we present a re-analysis of a time-series transcriptomic dataset, highlighting the value of easyClock as an accessible, open-source tool. This easy-to-use application requires no programming expertise and can be directly installed on Windows and macOS machines in a single step.

Citations: 0

LOSTdb: a manually curated multi-omics database for lung cancer research.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-03 DOI: 10.1186/s12859-025-06319-6

Hao Luo, Yunhao Yang, Zhipeng Gong, Lunxu Liu, Yaohui Chen

Lung cancer is one of the most prevalent malignant tumors with high morbidity and mortality rates worldwide. Extensive multi-omics analyses have revealed significant intratumoral heterogeneity even within the same histopathological subtype. However, a database that systematically integrates multi-omics data for lung cancer research has long been lacking. Here, we developed LOSTdb, a molecular subtype annotation system for lung cancer that integrates multi-omics data and metadata. LOSTdb comprises 295 multi-omics datasets, including bulk RNA-seq, genomic, proteomic, methylation, and scRNA-seq data, with over 10,000 manually curated metadata entries. This resource encompasses high-quality clinical specimens, mouse models, and cell lines, totaling 34,393 samples and more than 1.2 million single cells. Each omics sample was annotated with both literature-based classical subtypes and NMF-derived meta-program (MP) subtypes. The platform supports cross-searching of omics and metadata at the gene and dataset levels, offers multiple visualization and analysis methods, and includes five tool modules, enabling functions such as integrated analysis, significance analysis between metadata as well as between genes and metadata, and target prediction for lung cancer molecular subtypes, serving as an essential tool for lung cancer precision medicine. LOSTdb is a user-friendly interactive database freely accessible at http://lostdbcancer.com:8080 .

Citations: 0

Haplotype-based autoencoders can reduce the dataset dimension and estimate haplotype block effects in different crop species.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-12-02 DOI: 10.1186/s12859-025-06323-w

Philipp Georg Heilmann, Emanuel Grosch, Matthias Frisch, Matthias Herrmann, Steffen Beuch, Vivek Kurra, Martin Mascher, Raz Avni, Klaus Oldach, Ina Röhrs, Anja Hanemann, Raja Ram Mehta, Carsten Reinbrecht, Albrecht Serfling, Andreas Stahl, Marco Stucke, Amine Abbadi, Tobias Kox, Carola Zenke-Philippi

Citations: 0

A parametric survival model with Bayesian structural equation based on multi-omics integration.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-11-29 DOI: 10.1186/s12859-025-06338-3

Jiadong Chu, Yu Wang, Na Sun, Qiang Han, Ziqing Sun, Mengtong Sun, Yuheng Yuan, Qida He, Yueping Shen

Background: Multi-omics integration may provide additional information about the development of tumors and improve the performance of predictive models. The key challenge lies in integrating several omics sources, especially to capture their biological relationships. Previous studies proposed a structural equation model framework to combine two data platforms for predicting survival; however, several limitations remain.

Results: In this study, we introduce an extended Bayesian survival model combined with a structural equation model for adaptation to broader applications. The No-U-Turn Sampler (NUTS) was used to efficiently sample the posterior distribution of the model parameters. Across a series of simulation studies, our model showed excellent goodness-of-fit and predictive performance. To validate the model, we used a gastric cancer dataset with three omics types (mRNA, microRNA, and methylation) obtained from The Cancer Genome Atlas. After bioinformatic processing, we incorporated six mRNA, microRNA, and methylation loci datasets into the framework and found that our model exhibited greater predictive performance than non-integrated and Integrative Bayesian Analysis of Genomics (iBAG) models.

Conclusions: Our extended Bayesian structural equation model for multi-omics survival analysis provides a robust framework that significantly enhances predictive accuracy by effectively capturing complex biological relationships across diverse omics data sources, demonstrating clear advantages over both non-integrated approaches and existing integrative methods such as iBAG.
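
To make the phrase "parametric survival model" concrete, the sketch below evaluates the log-likelihood of right-censored survival times under a Weibull distribution, the kind of quantity a sampler such as NUTS would combine with priors when exploring the posterior. The abstract does not state which parametric family the authors use, so the Weibull choice, the function name, and the argument names are assumptions for illustration only; the structural equation part of the model is not represented.

```python
import math


def weibull_log_likelihood(times, events, shape, scale):
    """Log-likelihood of right-censored data under a Weibull model.

    times: observed follow-up times (all > 0).
    events: 1 if the event was observed at that time, 0 if censored.
    shape, scale: Weibull parameters k and lambda (both > 0).

    Uses hazard h(t) = (k / lambda) * (t / lambda) ** (k - 1) and
    survival S(t) = exp(-(t / lambda) ** k); each subject contributes
    event * log h(t) + log S(t).
    """
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    total = 0.0
    for t, d in zip(times, events):
        z = t / scale
        log_hazard = math.log(shape) - math.log(scale) + (shape - 1.0) * math.log(z)
        log_survival = -(z ** shape)
        total += d * log_hazard + log_survival
    return total
```

With `shape=1` and `scale=1` the model reduces to an exponential distribution with rate 1, so one event at time 1.0 and one censored observation at time 2.0 give a log-likelihood of -3.0.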

Citations: 0

ProTrack3D: a comprehensive tool for segmentation and tracking of proteins with split and fusion.

IF 3.3 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics

Pub Date : 2025-11-28 DOI: 10.1186/s12859-025-06307-w

Ramu Gautam, Yang Jiao, Yasong Pang, Mo Weng, Mei Yang

Citations: 0