Pub Date: 2024-05-03; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae040
Anna-Sophie Fiston-Lavier, Sandra Dérozier, Guy Perrière, Marie-France Sagot
Anna-Sophie Fiston-Lavier, Sandra Dérozier, Guy Perrière, Marie-France Sagot. "ISMB/ECCB 2023 organization benefited from the strengths of the French bioinformatics community." Bioinformatics Advances, published 2024-05-03. DOI: 10.1093/bioadv/vbae040. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11076915/pdf/
Pub Date: 2024-05-02; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae062
Tomás V Waichman, M L Vercesi, Ariel A Berardino, Maximiliano S Beckel, Damiana Giacomini, Natalí B Rasetto, Magalí Herrero, Daniela J Di Bella, Paola Arlotta, Alejandro F Schinder, Ariel Chernomoretz
Motivation: Single-cell RNA sequencing (scRNAseq) has transformed our ability to explore biological systems. Nevertheless, proficient expertise is essential for handling and interpreting the data.
Results: In this article, we present scX, an R package built on the Shiny framework that streamlines the analysis, exploration, and visualization of single-cell experiments. With an interactive graphic interface, implemented as a web application, scX provides easy access to key scRNAseq analyses, including marker identification, gene expression profiling, and differential gene expression analysis. Additionally, scX seamlessly integrates with commonly used single-cell Seurat and SingleCellExperiment R objects, resulting in efficient processing and visualization of varied datasets. Overall, scX serves as a valuable and user-friendly tool for effortless exploration and sharing of single-cell data, simplifying some of the complexities inherent in scRNAseq analysis.
Availability and implementation: Source code can be downloaded from https://github.com/chernolabs/scX. A docker image is available from dockerhub as chernolabs/scx.
Tomás V Waichman, M L Vercesi, Ariel A Berardino, Maximiliano S Beckel, Damiana Giacomini, Natalí B Rasetto, Magalí Herrero, Daniela J Di Bella, Paola Arlotta, Alejandro F Schinder, Ariel Chernomoretz. "scX: a user-friendly tool for scRNAseq exploration." Bioinformatics Advances, published 2024-05-02. DOI: 10.1093/bioadv/vbae062. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11109472/pdf/
Karen Ross, Frederic B Bastian, Matt Buys, Charles E Cook, Peter D'Eustachio, Melissa Harrison, H. Hermjakob, Donghui Li, Phillip Lord, Darren A Natale, Bjoern Peters, Paul W. Sternberg, Andrew I Su, Matthew Thakur, Paul D Thomas, Alex Bateman
Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge. The paper reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources. Supplementary data are available at Bioinformatics Advances online.
Karen Ross, Frederic B Bastian, Matt Buys, Charles E Cook, Peter D'Eustachio, Melissa Harrison, H. Hermjakob, Donghui Li, Phillip Lord, Darren A Natale, Bjoern Peters, Paul W. Sternberg, Andrew I Su, Matthew Thakur, Paul D Thomas, Alex Bateman. "Perspectives on tracking data reuse across biodata resources." Bioinformatics Advances, published 2024-04-25. DOI: 10.1093/bioadv/vbae057.
Pub Date: 2024-04-24; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae061
Jose L Figueroa, Andrew Redinbo, Ajay Panyala, Sean Colby, Maren L Friesen, Lisa Tiemann, Richard Allen White
Motivation: MerCat2 ("Mer-Catenate2") is a versatile, parallel, scalable and modular property software package for robustly analyzing features in omics data. Using massively parallel sequencing raw reads, assembled contigs, and protein sequences from any platform as input, MerCat2 performs k-mer counting of any length k, resulting in feature abundance counts tables, quality control reports, protein feature metrics, and graphical representation (i.e. principal component analysis (PCA)).
Results: MerCat2 allows for direct analysis of data properties in a database-independent manner that initializes all data, which other profilers and assembly-based methods cannot perform. MerCat2 represents an integrated tool to illuminate omics data within a sample for rapid cross-examination and comparisons.
Availability and implementation: MerCat2 is written in Python and distributed under a BSD-3 license. The source code of MerCat2 is freely available at https://github.com/raw-lab/mercat2. MerCat2 is compatible with Python 3 on Mac OS X and Linux. MerCat2 can also be easily installed using bioconda: mamba create -n mercat2 -c conda-forge -c bioconda mercat2.
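The k-mer counting step described above can be illustrated with a minimal sketch. This is not MerCat2's implementation: the function names `count_kmers` and `kmer_table` are hypothetical, the code runs serially on in-memory strings, and it assumes sequences have already been read from their FASTA/FASTQ files.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in one sequence."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one position at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_table(samples: dict[str, str], k: int) -> dict[str, Counter]:
    """Build a per-sample feature abundance table (sample name -> k-mer counts)."""
    return {name: count_kmers(seq, k) for name, seq in samples.items()}


if __name__ == "__main__":
    # Hypothetical toy input: two short nucleotide sequences.
    table = kmer_table({"sample_a": "ATGATGA", "sample_b": "ACGT"}, k=3)
    for name, counts in table.items():
        print(name, dict(counts))
```

A table of this shape (samples by k-mers) is the kind of input a downstream ordination such as PCA would take; a production tool would additionally stream large files and parallelize across them.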
Jose L Figueroa, Andrew Redinbo, Ajay Panyala, Sean Colby, Maren L Friesen, Lisa Tiemann, Richard Allen White. "MerCat2: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from omics data." Bioinformatics Advances, published 2024-04-24. DOI: 10.1093/bioadv/vbae061. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11090762/pdf/
Many diseases are complex heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors, including molecular and environmental ones, requiring a holistic approach to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have mainly focused on linear relationships despite the complexity of these relationships. On the other hand, methods for nonlinear association and classification studies are limited in their ability to identify variables to aid in our understanding of the complexity of the disease or can be applied to only two data types. We propose Deep IDA (Integrative Discriminant Analysis), a deep learning method to learn complex nonlinear transformations of two or more views such that resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups and that related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the post sequelae effects of COVID-19 to devise effective treatments and to improve patient care. Our algorithms are implemented in PyTorch and available at: https://github.com/JiuzhouW/DeepIDA. Supplementary materials are available at Bioinformatics Advances online.
Jiuzhou Wang, S. Safo. "Deep IDA: A Deep Learning Approach for Integrative Discriminant Analysis of Multi-omics Data with Feature Ranking- An Application to COVID-19." Bioinformatics Advances, published 2024-04-24. DOI: 10.1093/bioadv/vbae060.
Pub Date: 2024-04-19; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae059
Alexander R Bennett, Daniel Bojar
Motivation: Structural analysis of glycans poses significant challenges in glycobiology due to their complex sequences. Research questions such as analyzing the sequence content of the α1-6 branch in N-glycans are biologically meaningful yet can be hard to automate.
Results: Here, we introduce a regular expression system, designed for glycans, feature-complete, and closely aligned with regular expression formatting. We use this to annotate glycan motifs of arbitrary complexity, perform differential expression analysis on designated sequence stretches, or elucidate branch-specific binding specificities of lectins in an automated manner. We are confident that glycan regular expressions will empower computational analyses of these sequences.
Availability and implementation: Our regular expression framework for glycans is implemented in Python and is incorporated into the open-source glycowork package (version 1.1+). Code and documentation are available at https://github.com/BojarLab/glycowork/blob/master/glycowork/motif/regex.py.
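To make the motivating question concrete, the sketch below pulls out the α1-6 branch of an N-glycan using only Python's standard `re` module on an IUPAC-condensed string. It is not the glycowork regular expression system and does not use its API; the function name `a1_6_branch`, the two patterns, and the example glycan are assumptions chosen for illustration. It only handles a single, un-nested bracketed branch attached to the core β1-4 mannose, which is precisely the kind of limitation that motivates a glycan-aware expression language.

```python
import re

# Bracketed branch that ends in Man(a1-6) and attaches to the core Man(b1-4).
# [^\[\]] excludes nested brackets, so branched antennae are NOT handled here.
BRANCH = re.compile(r"\[([^\[\]]*?Man\(a1-6\))\]Man\(b1-4\)")

# One residue followed by its linkage, e.g. "GlcNAc(b1-2)".
RESIDUE = re.compile(r"([A-Za-z0-9]+)\(([ab?]\d-\d)\)")


def a1_6_branch(glycan: str) -> list[tuple[str, str]]:
    """Return (monosaccharide, linkage) pairs on the a1-6 branch, or [] if absent."""
    match = BRANCH.search(glycan)
    if match is None:
        return []
    return RESIDUE.findall(match.group(1))


if __name__ == "__main__":
    # Hypothetical biantennary N-glycan in IUPAC-condensed notation.
    glycan = (
        "Gal(b1-4)GlcNAc(b1-2)Man(a1-3)"
        "[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]"
        "Man(b1-4)GlcNAc(b1-4)GlcNAc"
    )
    print(a1_6_branch(glycan))
```

Plain string patterns like these break down as soon as the branch itself is branched or the same structure is written in a different but equivalent order, since a glycan is a tree rather than a linear string.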
Alexander R Bennett, Daniel Bojar. "Syntactic sugars: crafting a regular expression framework for glycan structures." Bioinformatics Advances, published 2024-04-19. DOI: 10.1093/bioadv/vbae059. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11069104/pdf/
Summary: The revised WHO guidelines for classifying and grading brain tumors include several copy number variation (CNV) markers. The turnaround time for detecting CNVs and alterations throughout the entire genome is drastically reduced with the customized read incremental approach on the nanopore platform. However, this approach is challenging for non-bioinformaticians due to the need to use multiple software tools, extract CNV markers and interpret results, which creates barriers due to the time and specialized resources that are necessary. To address this problem and help clinicians classify and grade brain tumors, we developed GLIMMERS: glioma molecular markers exploration using long-read sequencing, an open-access tool that automatically analyzes nanopore-based CNV data and generates simplified reports.
Availability and implementation: GLIMMERS is available at https://gitlab.com/silol_public/glimmers under the terms of the MIT license.
Wichayapat Thongrattana, Tantip Arigul, Bhoom Suktitipat, Manop Pithukpakorn, Sith Sathornsumetee, Thidathip Wongsurawat, Piroon Jenjaroenpun. "GLIMMERS: glioma molecular markers exploration using long-read sequencing." Bioinformatics Advances, published 2024-04-15. DOI: 10.1093/bioadv/vbae058. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11087932/pdf/
Mass spectrometry-based system proteomics allows identification of dysregulated protein hubs and associated disease-related features. Obtaining differentially expressed proteins (DEPs) is the most important step of downstream bioinformatics analysis. However, the extraction of statistically significant DEPs from datasets with multiple experimental conditions or disease types through currently available tools remains a laborious task. More often such an analysis requires considerable bioinformatics expertise, making it inaccessible to researchers with limited computational analytics experience. To uncover the differences among the many conditions within the data in a user-friendly manner, here we introduce FlexStat, a web-based interface that extracts DEPs through combinatory analysis. This tool accepts a protein expression matrix as input and systematically generates DEP results for every conceivable combination of various experimental conditions or disease types. FlexStat includes a suite of robust statistical tools for data preprocessing, in addition to DEP extraction, and publication-ready visualization, which are built on established R scientific libraries in an automated manner. This analytics suite was validated in diverse public proteomic datasets to showcase its high performance of rapid and simultaneous pairwise comparisons of comprehensive data sets. FlexStat is implemented in R and is freely available at https://jglab.shinyapps.io/flexstatv1-pipeline-only/. The source code is accessible at https://github.com/kts-desilva/FlexStat/tree/main. Supplementary data are available at Bioinformatics Advances online.
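The combinatory idea, running a comparison for every pair of conditions in one pass, can be sketched as follows. This is an illustration in Python rather than FlexStat's R code: the function names and the toy matrix are hypothetical, and a simple difference of group means stands in for the statistical testing and preprocessing the tool actually performs.

```python
from itertools import combinations
from statistics import mean


def group_by_condition(values, labels):
    """Split one protein's per-sample values into lists keyed by condition label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return groups


def pairwise_mean_differences(matrix, labels):
    """For every protein and every pair of conditions, return mean(a) - mean(b).

    matrix: {protein name: [expression value per sample]}
    labels: condition label of each sample, in the same order as the values
    """
    pairs = list(combinations(sorted(set(labels)), 2))
    results = {}
    for protein, values in matrix.items():
        groups = group_by_condition(values, labels)
        for a, b in pairs:
            # Placeholder effect size; a real analysis would apply a statistical
            # test and multiple-testing correction at this point.
            results[(protein, a, b)] = mean(groups[a]) - mean(groups[b])
    return results


if __name__ == "__main__":
    labels = ["ctrl", "ctrl", "A", "A", "B", "B"]
    matrix = {"P1": [1.0, 3.0, 5.0, 7.0, 2.0, 2.0]}
    for key, diff in pairwise_mean_differences(matrix, labels).items():
        print(key, diff)
```

With n conditions this enumerates n(n-1)/2 comparisons, which is why doing them by hand becomes laborious as the number of conditions grows.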
Senuri De Silva, Asfa Alli-Shaik, J. Gunaratne. "FlexStat: Combinatory differentially expressed protein extraction." Bioinformatics Advances, published 2024-04-11. DOI: 10.1093/bioadv/vbae056.
Summary: Chromatin accessibility serves as a critical measurement of physical contact between nuclear macromolecules and DNA sequence, providing valuable insights into the comprehensive landscape of regulatory mechanisms. We therefore previously developed the OpenAnnotate web server. However, as an increasing number of epigenomic analysis software tools emerged, web-based annotation often faced limitations and inconveniences when integrated into these software pipelines. To address these issues, we here develop two software packages named OpenAnnotatePy and OpenAnnotateR. In addition to web-based functionalities, these packages encompass supplementary features, including the capability for simultaneous annotation across multiple cell types, advanced searching of systems, tissues and cell types, and converting the result to the data structure of mainstream tools. Moreover, we applied the packages to various scenarios, including cell type revealing, regulatory element prediction, and integration into mainstream single-cell ATAC-seq analysis pipelines including EpiScanpy, Signac, and ArchR. We anticipate that OpenAnnotateApi will significantly facilitate the deciphering of gene regulatory mechanisms, and offer crucial assistance in the field of epigenomic studies.
Availability and implementation: OpenAnnotateApi for R is available at https://github.com/ZjGaothu/OpenAnnotateR and for Python at https://github.com/ZjGaothu/OpenAnnotatePy.
Zijing Gao, Rui Jiang, Shengquan Chen. "OpenAnnotateApi: Python and R packages to efficiently annotate and analyze chromatin accessibility of genomic regions." Bioinformatics Advances, published 2024-04-10. DOI: 10.1093/bioadv/vbae055.
Pub Date: 2024-04-09; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae050
[This corrects the article DOI: 10.1093/bioadv/vbad136.]
"Correction to: wTSA-CRAFT: an open-access web server for rapid analysis of thermal shift assay experiments." Bioinformatics Advances, published 2024-04-09. DOI: 10.1093/bioadv/vbae050. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11004552/pdf/