
Bioinformatics Advances: latest articles

Bridging worlds: connecting glycan representations with glycoinformatics via Universal Input and a canonicalized nomenclature.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-12-01 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf310
James Urban, Roman Joeres, Daniel Bojar

Motivation: As the field of glycobiology has developed, so too have different glycan nomenclature systems. While each system serves specific purposes, this multiplicity creates challenges for usability, data integration, and knowledge sharing across different databases and computational tools.

Results: We present a practical framework for automated nomenclature conversion that takes any glycan nomenclature as input without requiring declaration of the specific language and outputs a canonicalized IUPAC-condensed format as a standardized representation. Our implementation handles all common nomenclatures including WURCS, GlycoCT, IUPAC-condensed/extended, GLYCAM, CSDB-linear, LinearCode, GlycoWorkbench, GlySeeker, Oxford, and KCF, along with common typos, and manages complex cases including structural ambiguities, modifications, uncertainty in linkage information, and different compositional representations. This Universal Input framework can translate more than 10 nomenclatures in <1 ms per glycan, tested on over 150 000 sequences with 98%-100% coverage, enabling seamless integration of existing glycan databases and tools while maintaining the specific advantages of each representation system.

Availability and implementation: Universal Input is implemented within the glycowork Python package, available at https://github.com/BojarLab/glycowork and our web app https://canonicalize.streamlit.app/.
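
The Results paragraph above says the framework accepts a glycan string without the caller declaring its nomenclature. A minimal sketch of that first step, format detection, is given below. It is not glycowork's implementation: the function name `guess_nomenclature` and the heuristics are assumptions for illustration, relying only on well-known surface features (WURCS strings begin with `WURCS=`, GlycoCT condensed records begin with a `RES` section, KCF records begin with an `ENTRY` line, and IUPAC-condensed writes linkages in parentheses).

```python
def guess_nomenclature(glycan: str) -> str:
    """Guess a glycan string's nomenclature from surface features only.

    Illustrative heuristics, not the detection logic used by glycowork.
    """
    text = glycan.strip()
    if text.startswith("WURCS="):
        return "WURCS"
    if text.startswith("RES"):
        return "GlycoCT"
    if text.startswith("ENTRY"):
        return "KCF"
    # IUPAC-condensed writes linkages in parentheses, e.g. Gal(b1-4)GlcNAc
    if "(" in text and ")" in text:
        return "IUPAC-condensed"
    return "unknown"


print(guess_nomenclature("Gal(b1-4)GlcNAc"))  # IUPAC-condensed
```

A real converter would follow detection with a format-specific parser and a canonical ordering of branches; the order of the checks matters here because GlycoCT linkage lines also contain parentheses.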

Citations: 0
Omics BioAnalytics: an RShiny application for multimodal biomarker panel discovery and assessment.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-27 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbaf307
Josh Dyce, Lea Rieskamp, Scott J Tebbutt, Bruce M McManus, Amrit Singh

Motivation: Machine learning offers a powerful approach for building predictive models from high-dimensional molecular data. Omics technologies such as transcriptomics, proteomics, and metabolomics quantify thousands of molecules simultaneously, providing deep insights into disease biology. Integrating multiple modalities can enhance predictive performance, as shown in histology-omics and holter-omics applications. To support streamlined, reproducible, and user-friendly multimodal analytics, we developed Omics BioAnalytics, an R Shiny platform for unified analysis, integration, and interpretation of diverse omics datasets.

Results: Omics BioAnalytics performs late integration using ensembles of elastic net models trained independently on each modality, with predictions averaged across datasets. The platform provides interactive dashboards for metadata exploration, exploratory analyses, differential expression, gene set analysis, and biomarker discovery. Results are visualized through dynamic plots and downloadable reports, ensuring transparent and reproducible workflows. A unique feature is the integrated multimodal Alexa Skill, which enables voice-based querying and rapid visualization. Together, these web and voice-enabled tools offer accessible and reproducible multimodal analytics for biomedical researchers, supporting the discovery of molecular signatures, predictive biomarkers, and therapeutic targets.

Availability and implementation: All source code, public datasets, video walkthroughs, and the deployed application are available at: https://github.com/CompBio-Lab/omicsBioAnalytics.
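
The late-integration step described in the Results paragraph (one model per modality, predictions averaged) can be sketched as follows. This is an illustration in Python rather than the platform's R code; the function name `late_integration`, the modality names, and the probabilities are hypothetical, and the per-modality elastic net models are assumed to have been fitted already.

```python
from statistics import fmean


def late_integration(per_modality_probs: dict[str, list[float]]) -> list[float]:
    """Average per-sample predicted probabilities across modalities."""
    columns = list(per_modality_probs.values())
    if not columns:
        raise ValueError("need at least one modality")
    n_samples = len(columns[0])
    if any(len(col) != n_samples for col in columns):
        raise ValueError("all modalities must score the same samples")
    # zip(*columns) yields, for each sample, its score from every modality
    return [fmean(sample_scores) for sample_scores in zip(*columns)]


# Hypothetical outputs of two independently trained models on three samples
probs = {
    "transcriptomics": [0.90, 0.20, 0.60],
    "proteomics": [0.70, 0.40, 0.30],
}
ensemble = late_integration(probs)
```

Averaging at the prediction level lets each modality keep its own feature selection and regularization, at the cost of ignoring cross-modality interactions.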

Citations: 0
Integrative analysis and imputation of multiple data streams via deep Gaussian processes.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-27 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbaf305
Ali A Septiandri, Deyu Ming, Francisco Alejandro DiazDelaO, Takoua Jendoubi, Samiran Ray

Motivation: Healthcare data, particularly in critical care settings, presents three key challenges for analysis. First, physiological measurements come from different sources but are inherently related. Yet, traditional methods often treat each measurement type independently, losing valuable information about their relationships. Second, clinical measurements are collected at irregular intervals, and these sampling times can carry clinical meaning. Finally, missing values are prevalent. Whilst several imputation methods exist to tackle this common problem, they often fail to address the temporal nature of the data or provide estimates of uncertainty in their predictions.

Results: We propose using deep Gaussian process emulation with stochastic imputation, a methodology initially conceived to deal with computationally expensive models and uncertainty quantification, to solve the problem of handling missing values that naturally occur in critical care data. This method leverages longitudinal and cross-sectional information and provides uncertainty estimation for the imputed values. Our evaluation on a clinical dataset shows that the proposed method performs better than conventional methods, such as multiple imputation by chained equations (MICE), last-known value imputation, and individually fitted Gaussian processes (GPs).

Availability and implementation: The source code of the experiments is freely available at: https://github.com/aliakbars/dgpsi-picu.
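
For context, the simplest baseline named in the Results paragraph, last-known value imputation, can be sketched in a few lines. The function name and the `heart_rate` series are hypothetical and are not taken from the authors' repository.

```python
from typing import Optional


def last_known_value(series: list[Optional[float]]) -> list[Optional[float]]:
    """Fill each missing entry (None) with the most recent observed value."""
    filled: list[Optional[float]] = []
    last: Optional[float] = None
    for value in series:
        if value is not None:
            last = value
        # Entries before the first observation stay None
        filled.append(last)
    return filled


heart_rate = [None, 120.0, None, None, 115.0, None]
print(last_known_value(heart_rate))
# [None, 120.0, 120.0, 120.0, 115.0, 115.0]
```

This baseline returns a single point value per gap, uses no information from other measurement streams, and ignores how long ago the last observation was taken, which are the limitations the abstract raises.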

Citations: 0
Prompt-based bioinformatic pipeline generation for a multi-step metaviral workflow.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-27 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbaf308
Pengchong Ma, Haoze Zheng, Weijun Yi, Li Ma, Brandi Sigmon, Karrie A Weber, Gangqing Hu, Qiuming Yao

Motivation: The rapid evolution of bioinformatics tools and multi-step analytic procedures presents a challenge for building effective pipelines, particularly for researchers without extensive programming expertise. This study demonstrates that large language models (LLMs) hold strong potential for generating end-to-end bioinformatic pipelines through carefully crafted prompts, using a multi-step metaviral workflow as a representative example. Multiple LLMs were tested for their effectiveness, including the OpenAI ChatGPT series, the Anthropic Claude series, Google Gemini, Meta Llama, and DeepSeek.

Results: Our results show that ChatGPT-4, ChatGPT-5, Claude 4.5, and Gemini 2.5 consistently outperform other LLMs in generating complete bioinformatic pipelines, with statistically significant success rates. These models also handle tool substitutions effectively. Simple prompt engineering and the inclusion of official documentation further enhance performance, especially for newer bioinformatic tools. While capabilities vary, all LLMs tested show potential for both pipeline generation and updates with our designed prompts and strategies.

Availability and implementation: All prompts are available in the paper. The examples are available on GitHub at https://github.com/mpckkk/pBio.
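
The Results paragraph reports that adding official documentation to a prompt helps, especially for newer tools. One way such a prompt could be assembled is sketched below. The wording, the function name `build_pipeline_prompt`, and the step and tool names are placeholders chosen for illustration; they are not the prompts or the tools used in the paper.

```python
from typing import Optional


def build_pipeline_prompt(
    steps: list[tuple[str, str]],
    docs: Optional[dict[str, str]] = None,
) -> str:
    """Assemble one prompt from ordered (task, tool) steps and optional docs."""
    lines = [
        "Write a single bash pipeline that runs the following steps in order.",
        "Pass each step's output to the next step.",
        "",
    ]
    for number, (task, tool) in enumerate(steps, start=1):
        lines.append(f"{number}. {task} using {tool}")
    # Append documentation excerpts so the model can rely on current usage
    for tool, excerpt in (docs or {}).items():
        lines += ["", f"Official documentation for {tool}:", excerpt]
    return "\n".join(lines)


# Placeholder steps; the tool names are examples only
prompt = build_pipeline_prompt(
    steps=[("quality trimming", "fastp"), ("assembly", "MEGAHIT")],
    docs={"fastp": "usage: fastp -i in.fq -o out.fq"},
)
print(prompt)
```

Keeping the steps as data makes a tool substitution a one-line change to the `steps` list rather than a rewrite of the prompt.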

Citations: 0
Colony: a framework for reproducible and easy-to-use data analysis pipelines for biomedical research with singularity containers.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-26 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbaf304
Sebastian Eschner, Mohammad Alabdullah, Martin Dugas

Summary: Bioinformatics pipelines should meet the FAIR criteria to enable reproducible analysis. FAIR describes four key requirements for reproducible research: findability, accessibility, interoperability and reusability. Software containers such as Singularity are widely used tools that facilitate the reuse of software across different computing environments. However, many biologists and other researchers find command line tools such as Singularity unfamiliar and do not feel productive when using software via the command line. We present a graphical user interface that allows biologists without programming experience to interact with containerized software. We evaluate the feasibility of our approach with software used at the TRR156.

Availability and implementation: Colony can be freely downloaded on its project page: https://clipc-jpg.github.io/ColonyWebsite/. The Colony launcher's code is MIT-licensed and freely available at: https://github.com/clipc-jpg/Colony. All related assets can be found at: https://doi.org/10.7910/DVN/Z3OTWY.
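
To illustrate what a graphical front end has to do on the user's behalf, the sketch below turns form-style inputs into a `singularity exec` command line. It is not Colony's code: the function name, image name, and paths are hypothetical, and only the standard `singularity exec --bind host:container image command` form is assumed.

```python
import shlex
from typing import Optional


def singularity_exec_command(
    image: str,
    command: list[str],
    binds: Optional[dict[str, str]] = None,
) -> list[str]:
    """Build the argv for running a command inside a Singularity image."""
    argv = ["singularity", "exec"]
    # Each bind mount maps a host directory to a path inside the container
    for host_path, container_path in (binds or {}).items():
        argv += ["--bind", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += command
    return argv


argv = singularity_exec_command(
    "pipeline.sif",
    ["python", "run.py", "--input", "/data/reads.fastq"],
    binds={"/home/user/project": "/data"},
)
print(shlex.join(argv))
# singularity exec --bind /home/user/project:/data pipeline.sif python run.py --input /data/reads.fastq
```

Building the command as a list and passing it to `subprocess.run` without a shell avoids quoting problems when paths contain spaces.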

Citations: 0
tskit_arg_visualizer: interactive plotting of ancestral recombination graphs.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-24 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf302
James Kitchens, Yan Wong

Motivation: Ancestral recombination graphs (ARGs) are a complete representation of the genetic relationships between recombining lineages and are of central importance in population genetics. Recent breakthroughs in simulation and inference methods have led to a surge of interest in ARGs. However, understanding how best to take advantage of the graphical structure of ARGs remains an open question for researchers. Here, we introduce tskit_arg_visualizer, a Python package for programmatically drawing ARGs using the interactive D3.js visualization library.

Results: We highlight the usefulness of this visualization tool for both teaching ARG concepts and exploring ARGs inferred from empirical datasets.

Availability and implementation: The latest stable version of tskit_arg_visualizer is available through the Python Package Index (https://pypi.org/project/tskit-arg-visualizer, currently v0.1.1). Documentation and the development version of the package are found on GitHub (https://github.com/kitchensjn/tskit_arg_visualizer).
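
As background on the graph structure being drawn, the sketch below stores a toy ARG as a list of edges with genomic intervals, using the same `left`, `right`, `parent`, `child` fields as tskit's edge table. It uses only the standard library and does not call tskit or tskit_arg_visualizer; the helper names and the six-node example are assumptions for illustration.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    left: float   # start of the genomic interval (inclusive)
    right: float  # end of the genomic interval (exclusive)
    parent: int
    child: int


def parents_at(edges: list[Edge], position: float) -> dict[int, int]:
    """Return {child: parent} for the local tree covering a position."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def recombinant_nodes(edges: list[Edge]) -> set[int]:
    """Nodes that inherit from different parents on different intervals."""
    parents: dict[int, set[int]] = {}
    for e in edges:
        parents.setdefault(e.child, set()).add(e.parent)
    return {child for child, ps in parents.items() if len(ps) > 1}


# Toy ARG: sample 1 inherits from node 3 on [0, 50) and node 4 on [50, 100)
edges = [
    Edge(0, 100, parent=3, child=0),
    Edge(0, 50, parent=3, child=1),
    Edge(50, 100, parent=4, child=1),
    Edge(0, 100, parent=4, child=2),
    Edge(0, 100, parent=5, child=3),
    Edge(0, 100, parent=5, child=4),
]
print(parents_at(edges, 25))     # {0: 3, 1: 3, 2: 4, 3: 5, 4: 5}
print(parents_at(edges, 75))     # {0: 3, 1: 4, 2: 4, 3: 5, 4: 5}
print(recombinant_nodes(edges))  # {1}
```

The two local trees differ only in where node 1 attaches, which is the kind of change an interactive ARG plot makes visible along the genome.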

Citations: 0
AI-powered rapid detection of multidrug-resistant Klebsiella pneumoniae with informative peaks of MALDI-TOF MS.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-24 eCollection Date: 2026-01-01 DOI: 10.1093/bioadv/vbaf303
Jang-Jih Lu, Chia-Ru Chung, Hsin-Yao Wang, Yun Tang, Ming-Chien Chiang, Li-Ching Wu, Justin Bo-Kai Hsu, Tzong-Yi Lee, Jorng-Tzong Horng

Motivation: Klebsiella pneumoniae is a highly virulent superbug with rising antibiotic resistance worldwide. While matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has transformed microbial identification, its application to antimicrobial resistance prediction remains underexplored, particularly for large clinical cohorts. In this study, we developed machine-learning models with feature-level interpretability using MALDI-TOF MS data to rapidly predict resistance to ciprofloxacin (CIP), cefuroxime (CXM), and ceftriaxone (CRO) in K. pneumoniae.

Results: Using more than 28 000 isolates from two hospitals, the best-performing models reached an independent test accuracy of 0.7858, with sensitivity of 0.7289 and specificity of 0.8127. Several resistance-associated m/z signals (including 3657, 4341, 4519, 4709, 5070, 5409, 5921, 5939, and 6516) were consistently enriched in resistant isolates, offering interpretable spectral markers linked to resistance. Performance remained stable in time-based validation but declined across hospitals, suggesting sensitivity to geographic variability in resistance profiles. Overall, this study demonstrates that combining MALDI-TOF MS with machine learning enables rapid and interpretable prediction of resistance to commonly used fluoroquinolones and cephalosporins in K. pneumoniae. These findings highlight the clinical potential of such models for supporting empiric therapy and emphasize the importance of incorporating local data or adaptive strategies to improve generalizability across healthcare settings.

Availability and implementation: Data are available on request from the authors.
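
The three performance figures quoted in the Results paragraph are linked by a simple identity, sketched below. The confusion-matrix counts are hypothetical, and the last calculation assumes the reported accuracy, sensitivity, and specificity all come from one confusion matrix, which the abstract does not state (they may be summarized across antibiotics or models).

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # resistant isolates correctly flagged
        "specificity": tn / (tn + fp),  # susceptible isolates correctly cleared
    }


def implied_positive_fraction(accuracy: float, sensitivity: float, specificity: float) -> float:
    """Solve accuracy = p * sensitivity + (1 - p) * specificity for p."""
    return (accuracy - specificity) / (sensitivity - specificity)


# Hypothetical counts: 100 resistant and 200 susceptible isolates
metrics = classification_metrics(tp=73, fp=38, tn=162, fn=27)
fraction = implied_positive_fraction(**metrics)  # recovers 100 / 300

# Figures quoted in the abstract, under the single-matrix assumption
reported = implied_positive_fraction(0.7858, 0.7289, 0.8127)
print(round(reported, 3))  # 0.321
```

Under that assumption, roughly a third of the test isolates would be resistant, so accuracy sits closer to specificity than to sensitivity.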

Citations: 0
Integrating differential privacy into federated multi-task learning algorithms in dsMTL.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-23 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf298
Roman Schefzik, Han Cao, Sivanesan Rajan, Xavier Escribà-Montagut, Juan R González, Emanuel Schwarz

Motivation: Multi-task learning (MTL) enables simultaneous learning of related regression or classification tasks by exploiting shared information. The R package dsMTL provides a computational framework for federated MTL approaches, supporting the analysis of sensitive, individual-level data from geographically distributed data sources using the DataSHIELD platform. While the current architecture provides comprehensive data security mechanisms, these are not specifically tailored to MTL models. In particular, these models may still be vulnerable to membership inference attacks, which use the model to try to determine whether a specific individual was included in a given training set.

Results: To further enhance the privacy-preserving capabilities of dsMTL and protect against such attacks, differential privacy using the Laplace mechanism is integrated into dsMTL as a novel optional feature. This approach aims to obscure individual-level characteristics from the model while retaining group-level differences. The differential privacy implementation is validated in both simulation studies and a case study identifying schizophrenia patients from gene expression data. For practical utility, it is crucial to balance the degree of privacy protection against the preservation of model performance by choosing a reasonable privacy parameter within the differential privacy mechanism.
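The Laplace mechanism named in the abstract can be illustrated with a minimal sketch. This is not the dsMTL implementation: the abstract does not say which quantities are perturbed or how sensitivity is bounded, so the function names `laplace_noise` and `privatize` and the parameters `sensitivity` and `epsilon` are assumptions made for illustration. The sketch only shows the general idea of adding zero-mean Laplace noise with scale `sensitivity / epsilon`, so that a smaller privacy parameter gives stronger protection at the cost of noisier values.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, sensitivity, epsilon, seed=None):
    """Return a noisy copy of `values` (hypothetical helper, not a dsMTL API).

    sensitivity: assumed upper bound on how much one individual can change
                 any single value.
    epsilon:     privacy parameter; smaller means more noise.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


if __name__ == "__main__":
    coefficients = [0.8, -1.2, 0.05]  # made-up values, for illustration only
    for eps in (0.1, 1.0, 10.0):
        print(eps, privatize(coefficients, sensitivity=1.0, epsilon=eps, seed=42))
```

Running the script with several values of `epsilon` makes the trade-off described above visible: the noisy values drift further from the inputs as `epsilon` decreases.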

Availability and implementation: dsMTL is open-source and available at https://github.com/transbioZI/dsMTLBase (server-side) and https://github.com/transbioZI/dsMTLClient (client-side).

Citations: 0
geomeTriD: a Bioconductor package for interactive and integrative visualization of 3D structural model with multi-omics data.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-23 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf299
Jianhong Ou, Kenneth D Poss

Motivation: The three-dimensional organization of the genome plays a critical role in regulating gene expression by shaping the spatial and temporal interactions between regulatory elements. High-throughput chromosome conformation capture (Hi-C) technologies, along with immunoprecipitation- or chromatin accessibility-based chromatin architecture mapping methods, enable the measurement of chromatin dynamics at both bulk and single-cell levels. However, effectively exploring and comparing chromatin structures remains challenging, particularly when integrating multiple layers of genomic annotation or comparing structural dynamics across conditions. While several tools support interactive 3D genome visualization, few provide a flexible, R-integrated framework that supports custom annotations, side-by-side comparison of multiple stages or conditions, and deployment in Shiny applications.

Results: To address this need, we have developed geomeTriD, an R/Bioconductor package that enables interactive visualization of chromatin structures using three.js, supports multi-layer annotation, allows parallel comparison of two chromatin states, and is compatible with Shiny-based analysis workflows. As multi-omic and spatial genomic datasets grow in complexity, geomeTriD will facilitate the reconstruction and comparison of 3D genome structures across conditions, linking chromatin architecture to gene regulation, epigenetic states, and cell-state transitions.

Availability and implementation: geomeTriD is freely available at https://bioconductor.org/packages/geomeTriD.

Citations: 0
Performance assessment of phylogenetic inference tools using PhyloSmew.
IF 2.8 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2025-11-23 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf300
Dimitri Höhler, Julia Haag, Alexey M Kozlov, Benoit Morel, Alexandros Stamatakis

Motivation: The performance of phylogenetic inference tools is commonly evaluated using simulated as well as empirical sequence data alignments. An open question is how representative these alignments are of those commonly analyzed by users. Using the RAxMLGrove database, it is now possible to simulate DNA and amino acid sequences based on more than 70 000 representative RAxML and RAxML-NG tree inferences on empirical datasets conducted on the RAxML web servers. This makes it possible to assess the phylogenetic tree inference accuracy of various inference tools on more realistic and representative simulated alignments.

Results: To automate this process, we implement PhyloSmew, a tool for benchmarking phylogenetic inference tools. We use it to simulate ∼20 000 multiple sequence alignments (MSAs) based on representative empirical trees (in terms of signal strength) from RAxMLGrove. We subsequently analyze 5000 empirical MSAs from the TreeBASE database to assess the inference accuracy of FastTree2, IQ-TREE2, and RAxML-NG. We find that on quantifiably difficult-to-analyze MSAs, all three tree inference tools perform poorly. Hence, the faster FastTree2 tool constitutes a viable alternative for inferring trees on difficult MSAs. We also find that there are substantial differences between accuracy results on simulated versus empirical data.

Availability and implementation: The data underlying this article are available at https://github.com/angtft/PhyloSmew and https://cme.h-its.org/exelixis/material/accuracy-study/data.tar.gz.

Citations: 0