
Bioinformatics Advances: latest articles

motifbreakR v2: expanded variant analysis including indels and integrated evidence from transcription factor binding databases.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-23 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae162
Simon G Coetzee, Dennis J Hazelett

Motivation: motifbreakR scans genetic variants against position weight matrices of transcription factors (TFs) to determine the potential for the disruption of binding at the site of the variant. It leverages the Bioconductor suite of software packages and annotations to query a diverse array of genomes and motif databases. Initially developed to interrogate the effect of single-nucleotide variants on TF binding sites, in motifbreakR v2, we have updated the functionality.
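The scanning idea described above can be illustrated with a minimal Python sketch. It is not motifbreakR's implementation or API: the 4-bp matrix, the uniform background, and the function names are all invented for illustration. It scores every motif window that overlaps a single-nucleotide variant for the reference and alternate alleles, then reports the change in the best log-odds score.

```python
import math

# Toy position frequency matrix for a hypothetical 4-bp motif (one dict per position).
# These probabilities are invented for illustration only.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # uniform base frequency, an assumption


def window_score(seq):
    """Sum of log2 odds for a sequence exactly as long as the motif."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PFM, seq))


def best_score(seq, var_pos):
    """Best score over all motif windows that overlap position var_pos (0-based)."""
    width = len(PFM)
    starts = range(max(0, var_pos - width + 1), min(var_pos, len(seq) - width) + 1)
    return max(window_score(seq[s:s + width]) for s in starts)


def disruption(ref_seq, alt_base, var_pos):
    """Score change (alt minus ref) caused by a single-nucleotide substitution."""
    alt_seq = ref_seq[:var_pos] + alt_base + ref_seq[var_pos + 1:]
    return best_score(alt_seq, var_pos) - best_score(ref_seq, var_pos)


ref = "TTACGTAA"                 # contains the consensus ACGT at positions 2..5
delta = disruption(ref, "T", 3)  # C -> T at the second motif position
print(f"score change: {delta:.3f}")
```

A negative change means the alternate allele matches the motif less well than the reference. A real analysis would use curated matrices and a calibrated threshold rather than this toy setup.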

Results: New features include the ability to query other types of complex genetic variants, such as short insertions and deletions. This capability allows modeling a more extensive array of variants that may have significant effects on TF binding. Additionally, predictions based on sequence preference alone can indicate many more potential binding events than are observed. Adding information from DNA-binding sequencing datasets lends confidence to motif disruption predictions by demonstrating TF binding in cell lines and tissue types. Therefore, motifbreakR can directly query the ReMap2022 database for evidence that a TF matching the disrupted motif binds over the disrupting variant. Finally, in addition to the existing interface, we implemented an R/Shiny graphical user interface for motifbreakR to simplify and broaden access for researchers with different skill sets.
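The evidence check described above amounts to an interval-overlap query: does a binding peak for the matching TF cover the variant? The Python sketch below shows that logic only. The `Peak` record, the peak coordinates, and the sample labels are hypothetical; they are not real ReMap2022 records and this is not how motifbreakR queries the database.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    """One binding peak; coordinates are 0-based, half-open [start, end)."""
    tf: str
    chrom: str
    start: int
    end: int
    sample: str  # cell line or tissue label


# Invented peaks for illustration; not real ReMap2022 records.
PEAKS = [
    Peak("CTCF", "chr1", 100, 250, "cell_line_A"),
    Peak("CTCF", "chr1", 400, 520, "tissue_B"),
    Peak("GATA1", "chr1", 120, 200, "cell_line_A"),
]


def supporting_peaks(peaks, tf, chrom, var_start, var_end):
    """Return peaks for `tf` that overlap the variant interval [var_start, var_end)."""
    return [
        p for p in peaks
        if p.tf == tf and p.chrom == chrom
        and p.start < var_end and var_start < p.end
    ]


# A hypothetical 3-bp deletion at chr1:248-251 that disrupts a CTCF motif.
hits = supporting_peaks(PEAKS, "CTCF", "chr1", 248, 251)
for peak in hits:
    print(peak.tf, peak.sample, peak.start, peak.end)
```

Representing the variant as an interval rather than a single position lets the same check cover both single-nucleotide variants and short indels.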

Availability and implementation: motifbreakR is implemented in R. Source code, documentation, and tutorials are available on Bioconductor at https://bioconductor.org/packages/release/bioc/html/motifbreakR.html and GitHub at https://github.com/Simon-Coetzee/motifBreakR.

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11520234/pdf/
Citations: 0
TransAnnot-a fast transcriptome annotation pipeline.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-22 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae152
Mariia Zelenskaia, Yazhini Arangasamy, Milot Mirdita, Johannes Söding, Venket Raghavan

Summary: The annotation of deeply sequenced, de novo assembled transcriptomes continues to be a challenge as some of the state-of-the-art tools are slow, difficult to install, and hard to use. We have tackled these issues with TransAnnot, a fast, automated transcriptome annotation pipeline that is easy to install and use. Leveraging the fast sequence searches provided by the MMseqs2 suite, TransAnnot offers one-step annotation of homologs from Swiss-Prot, gene ontology terms and orthogroups from eggNOG, and functional domains from Pfam. Users also have the option to annotate against custom databases. TransAnnot accepts sequencing reads (short and long), nucleotide sequences, or amino acid sequences as input for annotation. When benchmarked with test data sets of amino acid sequences, TransAnnot was 333, 284, and 18 times faster than comparable tools such as EnTAP, Trinotate, and eggNOG-mapper respectively.

Availability and implementation: TransAnnot is free to use, open sourced under GPLv3, and is implemented in C++ and Bash. Source code, documentation, and pre-compiled binaries are available at https://github.com/soedinglab/transannot. TransAnnot is also available via bioconda (https://anaconda.org/bioconda/transannot).

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530227/pdf/
Citations: 0
PatchProt: hydrophobic patch prediction using protein foundation models.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-14 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae154
Dea Gogishvili, Emmanuel Minois-Genin, Jan van Eck, Sanne Abeln

Motivation: Hydrophobic patches on protein surfaces play important functional roles in protein-protein and protein-ligand interactions. Large hydrophobic surfaces are also involved in the progression of aggregation diseases. Predicting exposed hydrophobic patches from a protein sequence has proven to be a difficult task. Fine-tuning foundation models allows a model to be adapted to the specific nuances of a new task using a much smaller dataset. Additionally, multitask deep learning offers a promising solution for addressing data gaps while also outperforming single-task methods.

Results: In this study, we harnessed a recently released, leading large language model, Evolutionary Scale Models (ESM-2). Efficient fine-tuning of ESM-2 was achieved by leveraging a recently developed parameter-efficient fine-tuning method. This approach enabled comprehensive training of model layers without excessive parameters and without the need to include a computationally expensive multiple sequence analysis. We explored several related tasks, at local (residue) and global (protein) levels, to improve the representation of the model. As a result, our model, PatchProt, can not only predict hydrophobic patch areas but also outperforms existing methods at predicting primary tasks, including secondary structure and surface accessibility predictions. Importantly, our analysis shows that including related local tasks can improve predictions on more difficult global tasks. This research sets a new standard for sequence-based protein property prediction and highlights the remarkable potential of fine-tuning foundation models and enriching the model representation by training on related tasks.
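One common way to train on local and global tasks together is to minimize a weighted sum of per-task losses. The Python sketch below illustrates that general pattern only. The loss choice (mean squared error), the weights, and the numbers are assumptions made for illustration; the abstract does not state PatchProt's actual loss, and this sketch does not reproduce it.

```python
def mean_squared_error(pred, target):
    """Mean of squared differences between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(local_pred, local_true, global_pred, global_true,
                   w_local=1.0, w_global=1.0):
    """Weighted sum of a residue-level (local) and a protein-level (global) loss."""
    local = mean_squared_error(local_pred, local_true)           # averaged over residues
    global_ = mean_squared_error([global_pred], [global_true])   # one value per protein
    return w_local * local + w_global * global_


# Invented numbers: per-residue relative accessibility and a per-protein patch area.
loss = multitask_loss(
    local_pred=[0.2, 0.8, 0.5], local_true=[0.0, 1.0, 0.5],
    global_pred=120.0, global_true=100.0,
    w_local=1.0, w_global=0.01,
)
print(f"combined loss: {loss:.4f}")
```

The weights keep tasks with very different numeric scales from dominating one another. In practice they are tuned or learned rather than fixed by hand.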

Availability and implementation: https://github.com/Deagogishvili/chapter-multi-task.

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11525051/pdf/
Citations: 0
Accelerating protein-protein interaction screens with reduced AlphaFold-Multimer sampling.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-11 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae153
Greta Bellinzona, Davide Sassera, Alexandre M J J Bonvin

Motivation: Discovering new protein-protein interactions (PPIs) across entire proteomes offers vast potential for understanding novel protein functions and elucidating system properties within or between organisms. While recent advances in computational structural biology, particularly AlphaFold-Multimer, have facilitated this task, scaling to large screens remains a challenge that requires significant computational resources.

Results: We evaluated the impact of reducing the number of models generated by AlphaFold-Multimer from five to one on the method's ability to distinguish true PPIs from false ones. Our evaluation was conducted on a dataset containing both intra- and inter-species PPIs, which included proteins from bacterial and eukaryotic sources. We demonstrate that reducing the sampling does not compromise the accuracy of the method, offering a faster, efficient, and environmentally friendly solution for PPI predictions.
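A comparison of this kind can be sketched as a ranking test: score each protein pair using either one model or the best of five, then measure how well each scoring separates true from false pairs. The Python sketch below uses the area under the ROC curve for that purpose. The scores and labels are invented, and both the "confidence score" and the choice of metric are assumptions, not the paper's data or its exact evaluation protocol.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented confidence scores: five models per candidate protein pair.
FIVE_MODEL_SCORES = [
    [0.82, 0.80, 0.85, 0.79, 0.81],  # true interaction
    [0.64, 0.70, 0.61, 0.66, 0.68],  # true interaction
    [0.35, 0.41, 0.38, 0.33, 0.40],  # true interaction
    [0.30, 0.28, 0.36, 0.31, 0.27],  # non-interacting pair
    [0.45, 0.39, 0.42, 0.47, 0.44],  # non-interacting pair
    [0.15, 0.22, 0.18, 0.20, 0.17],  # non-interacting pair
]
LABELS = [1, 1, 1, 0, 0, 0]

single_model = [models[0] for models in FIVE_MODEL_SCORES]   # one model per pair
best_of_five = [max(models) for models in FIVE_MODEL_SCORES]  # full sampling

auc_single = roc_auc(single_model, LABELS)
auc_best = roc_auc(best_of_five, LABELS)
print(f"single model AUC: {auc_single:.3f}, best-of-five AUC: {auc_best:.3f}")
```

In this toy data both scorings rank the pairs identically, which mirrors the kind of outcome the abstract reports, at roughly one fifth of the prediction cost.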

Availability and implementation: The code used in this article is available at https://github.com/MIDIfactory/AlphaFastPPi. Note that the same can be achieved using the latest version of AlphaPulldown available at https://github.com/KosinskiLab/AlphaPulldown.

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11513016/pdf/
Citations: 0
CAPTVRED: an automated pipeline for viral tracking and discovery from capture-based metagenomics samples.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-08 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae150
Maria Tarradas-Alemany, Sandra Martínez-Puchol, Cristina Mejías-Molina, Marta Itarte, Marta Rusiñol, Sílvia Bofill-Mas, Josep F Abril

Summary: Target Enrichment Sequencing, or capture-based metagenomics, has emerged as an approach of interest for viral metagenomics in complex samples. However, these datasets are usually analyzed with standard downstream bioinformatics analyses. CAPTVRED (Capture-based metagenomics Analysis Pipeline for tracking ViRal species from Environmental Datasets) has been designed to assess the virome present in complex samples, with a particular focus on those obtained by the Target Enrichment Sequencing approach. This work aims to provide a user-friendly tool that complements this sequencing approach for total or partial virome description, especially from environmental matrices. It includes a setup module that allows the pipeline to be prepared and adjusted for any capture panel directed at a set of species of interest. The tool also aims to reduce time and computational cost, and to provide comprehensive, reproducible, and accessible results while being easy to customize, set up, and install.

Availability and implementation: Source code and test datasets are freely available in the GitHub repository: https://github.com/CompGenLabUB/CAPTVRED.git.

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495672/pdf/
Citations: 0
DynProfiler: a Python package for comprehensive analysis and interpretation of signaling dynamics leveraged by deep learning techniques.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-07 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae145
Masato Tsutsui, Mariko Okada

Summary: Signaling dynamics encode important features and regulatory mechanisms of biological systems, and recent studies have reported the use of simulated signaling dynamics with mechanistic modeling as biomarkers for human diseases. Given the success of deep learning techniques, it is expected that they can extract informative patterns from simulation results more effectively than traditional approaches involving manual feature selection, which can be used for subsequent analyses, such as patient stratification and survival prediction. Here, we propose DynProfiler, which utilizes the entire signaling dynamics, including intermediate variables, as input and leverages deep learning techniques to extract informative features without requiring any labels. Furthermore, DynProfiler incorporates a modern explainable AI solution to provide quantitative time-dependent importance scores for each dynamics. Using simulated dynamics of patients with breast cancer as an example, we demonstrate DynProfiler's ability to extract high-quality features that can predict mortality risk and identify important dynamics, highlighting upregulated phosphorylated GSK3β as a biomarker for poor prognosis. Overall, this tool can be useful for clinical application, as well as for elucidating biological system dynamics.

Availability and implementation: The DynProfiler Python library is available on GitHub at https://github.com/okadalabipr/DynProfiler.

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11464416/pdf/
Citations: 0
Correction to: Utilizing biological experimental data and molecular dynamics for the classification of mutational hotspots through machine learning.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-04 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae140

[This corrects the article DOI: 10.1093/bioadv/vbae125.].

PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11453097/pdf/
Citations: 0
tidysbml: R/Bioconductor package for SBML extraction into dataframes.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-03 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae148
Veronica Paparozzi, Christine Nardini

Summary: We present tidysbml, an R package that extracts compartments, species, and reactions data from Systems Biology Markup Language (SBML) documents (up to Level 3) into tabular data structures (i.e. R dataframes), making the rich biological information easy to access and handle. Thanks to its output format, the package facilitates data manipulation, enabling manageable construction, and therefore analysis, of custom networks, as well as data retrieval, by means of R packages such as igraph, RCy3, and biomaRt. Exemplar data (i.e. SBML files) are extracted from Reactome.
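The kind of extraction described above, from nested SBML elements to flat rows, can be sketched with Python's standard library. This is an illustration of the idea only, not the tidysbml API (tidysbml is an R package). The `element_rows` helper and the embedded model are hypothetical, and the fragment is deliberately simplified, so it omits attributes that a valid SBML file would require.

```python
import xml.etree.ElementTree as ET

# Simplified toy fragment in the SBML Level 3 Version 1 core namespace.
# It is not a complete, valid SBML document.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfCompartments>
      <compartment id="cytosol" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S1" name="glucose" compartment="cytosol"/>
      <species id="S2" name="ATP" compartment="cytosol"/>
    </listOfSpecies>
  </model>
</sbml>
"""

NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}


def element_rows(root, list_tag, item_tag):
    """Return one dict of attributes per child element, i.e. one table row each."""
    path = f".//sbml:{list_tag}/sbml:{item_tag}"
    return [dict(el.attrib) for el in root.findall(path, NS)]


root = ET.fromstring(SBML_TEXT.strip())
species = element_rows(root, "listOfSpecies", "species")
compartments = element_rows(root, "listOfCompartments", "compartment")

for row in species:
    print(row["id"], row["name"], row["compartment"])
```

A list of attribute dictionaries maps directly onto a dataframe, with one column per attribute, which is what makes downstream joins against compartments or reactions straightforward.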

Availability and implementation: The tidysbml R package is distributed under the CC BY 4.0 License and is publicly available on Bioconductor (https://bioconductor.org/packages/tidysbml) and on GitHub (https://github.com/veronicapaparozzi/tidysbml).

摘要:我们介绍的 tidysbml 是一个 R 软件包,它能够以表格数据结构(即 R 数据框)从系统生物学标记语言(SBML)文档(最高 3 级)中提取区系、物种和反应数据,从而轻松访问和处理丰富的生物信息。得益于其输出格式,该软件包方便了数据操作,可通过 igraph、RCy3 和 biomaRt 等 R 软件包管理自定义网络的构建和分析,以及数据检索。 示例数据(即 SBML 文件)从 Reactome.Availability 和实现中提取:tidysbml R 软件包以 CC BY 4.0 许可发布,可在 Bioconductor (https://bioconductor.org/packages/tidysbml) 和 GitHub (https://github.com/veronicapaparozzi/tidysbml) 上公开获取。
{"title":"tidysbml: R/Bioconductor package for SBML extraction into dataframes.","authors":"Veronica Paparozzi, Christine Nardini","doi":"10.1093/bioadv/vbae148","DOIUrl":"https://doi.org/10.1093/bioadv/vbae148","url":null,"abstract":"<p><strong>Summary: </strong>We present <i>tidysbml</i>, an R package able to perform <i>compartments</i>, <i>species</i>, and <i>reactions</i> data extraction from Systems Biology Markup Language (SBML) documents (up to Level 3) in tabular data structures (i.e. R dataframes) to easily access and handle the richness of the biological information. Thanks to its output format, the package facilitates data manipulation, enabling manageable construction, and therefore analysis, of custom networks, as well as data retrieval, by means of R packages such as <i>igraph</i>, <i>RCy3</i>, and <i>biomaRt</i>. Exemplar data (i.e. SBML files) are extracted from Reactome.</p><p><strong>Availability and implementation: </strong>The <i>tidysbml</i> R package is distributed under CC BY 4.0 License and can be found publicly available in Bioconductor (https://bioconductor.org/packages/tidysbml) and on GitHub (https://github.com/veronicapaparozzi/tidysbml).</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":null,"pages":null},"PeriodicalIF":2.4,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11479578/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A modern multi-omics data exploration experience with Panomicon.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-10-03 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae147
Rodolfo S Allendes Osorio, Yuji Kosugi, Johan T Nyström-Persson, Kenji Mizuguchi, Yayoi Natsume-Kitatani

Summary: To address the challenges of storing, sharing, and analysing multi-omics data, we introduce the newest version of Panomicon. It includes an improved underlying data model, a new registration and access-control service, and seamless integration with other services (such as TargetMine for data enrichment analysis), all delivered in a completely new, more user-friendly web application.

Availability and implementation: Panomicon is available online at https://panomicon.nibiohn.go.jp. Unregistered users can access the publicly available data uploaded to Panomicon using the following account: user: guest, password: anonymous. Source code for the application is also freely available under a GNU license at https://github.com/Toxygates/Panomicon/. A brief user guide for the new features of Panomicon is provided as supplementary material online.

Citations: 0
iTraNet: a web-based platform for integrated trans-omics network visualization and analysis.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-09-30 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae141
Hikaru Sugimoto, Keigo Morita, Dongzi Li, Yunfan Bai, Matthias Mattanovich, Shinya Kuroda

Motivation: Visualization and analysis of biological networks play crucial roles in understanding living systems. Biological networks include diverse types, from gene regulatory networks and protein-protein interactions to metabolic networks. Metabolic networks include substrates, products, and enzymes, which are regulated by allosteric mechanisms and gene expression. However, the analysis of these diverse omics types is challenging due to the diversity of databases and the complexity of network analysis.

Results: We developed iTraNet, a web application that visualizes and analyses trans-omics networks involving four types of networks: gene regulatory networks, protein-protein interactions, metabolic networks, and metabolite exchange networks. Using iTraNet, we found that in wild-type mice, hub molecules within the network tended to respond to glucose administration, whereas in ob/ob mice, this tendency disappeared. With its ability to facilitate network analysis, we anticipate that iTraNet will help researchers gain insights into living systems.

Availability and implementation: iTraNet is available at https://itranet.streamlit.app/.

Citations: 0