Summary: Although multiple neural networks have been proposed for detecting secondary structures from medium-resolution (5-10 Å) cryo-electron microscopy (cryo-EM) maps, the loss functions used in the existing deep learning networks are primarily based on cross-entropy loss, which is known to be sensitive to class imbalances. We investigated five loss functions: cross-entropy, Focal loss, Dice loss, and two combined loss functions. Using a U-Net architecture in our DeepSSETracer method and a dataset composed of 1355 box-cropped atomic-structure/density-map pairs, we found that a newly designed loss function that combines Focal loss and Dice loss provides the best overall detection accuracy for secondary structures. For β-sheet voxels, which are generally much harder to detect than helix voxels, the combined loss function achieved a significant improvement (an 8.8% increase in the F1 score) compared to the cross-entropy loss function and a noticeable improvement from the Dice loss function. This study demonstrates the potential for designing more effective loss functions for hard cases in the segmentation of secondary structures. The newly trained model was incorporated into DeepSSETracer 1.1 for the segmentation of protein secondary structures in medium-resolution cryo-EM map components. DeepSSETracer can be integrated into ChimeraX, a popular molecular visualization software.
Availability and implementation: https://www.cs.odu.edu/~bioinfo/B2I_Tools/.
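To make the idea of a combined loss concrete, here is a minimal NumPy sketch of a weighted sum of a multi-class Focal loss and a soft Dice loss. It is an illustration only, not the DeepSSETracer implementation: the mixing weight `weight`, the focusing parameter `gamma`, the smoothing constant `eps`, and the three-class layout (background, helix, β-sheet) are all assumptions, since the abstract does not state how the two terms are combined.

```python
import numpy as np


def focal_loss(probs, onehot, gamma=2.0, eps=1e-7):
    """Mean multi-class focal loss: -(1 - p_t)^gamma * log(p_t).

    probs and onehot have shape (..., n_classes); p_t is the predicted
    probability of the true class. gamma=0 reduces to cross-entropy.
    """
    p_t = np.clip((probs * onehot).sum(axis=-1), eps, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))


def dice_loss(probs, onehot, eps=1e-7):
    """One minus the soft Dice coefficient, averaged over classes."""
    flat_p = probs.reshape(-1, probs.shape[-1])
    flat_t = onehot.reshape(-1, onehot.shape[-1])
    intersection = (flat_p * flat_t).sum(axis=0)
    denominator = flat_p.sum(axis=0) + flat_t.sum(axis=0)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return float(1.0 - dice.mean())


def combined_loss(probs, onehot, weight=0.5, gamma=2.0):
    """Weighted sum of the two terms; `weight` is a hypothetical knob."""
    return (weight * focal_loss(probs, onehot, gamma)
            + (1.0 - weight) * dice_loss(probs, onehot))


# Toy example: four voxels, three assumed classes (background, helix, sheet).
labels = np.array([0, 1, 2, 2])
onehot = np.eye(3)[labels]
uniform = np.full((4, 3), 1.0 / 3.0)
print(combined_loss(uniform, onehot))
```

The Focal term down-weights voxels that are already classified well, while the Dice term scores overlap per class, so a rare class such as β-sheet contributes as much as a common one. That is one plausible reason a combination could help with class imbalance.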
Title: The combined focal loss and dice loss function improves the segmentation of beta-sheets in medium-resolution cryo-electron-microscopy density maps. Authors: Yongcheng Mu, Thu Nguyen, Bryan Hawickhorst, Willy Wriggers, Jiangwen Sun, Jing He. Bioinformatics Advances 4(1), vbae169. Pub Date: 2024-11-22. DOI: 10.1093/bioadv/vbae169. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590252/pdf/
Pub Date: 2024-11-22; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae185
Patrick König, Anne Fiebig, Thomas Münch, Björn Grüning, Uwe Scholz
Motivation: The Galaxy workflow system is an open-source platform supporting data-intensive research in life sciences, featuring a user-friendly web interface for complex analyses without extensive programming. It also offers a representational state transfer based API, enabling remote execution of specific tools. Galaxy supports similarity searches for nucleotide and amino acid sequences, with integrated tools like NCBI BLAST+ and DIAMOND. However, no specialized software currently exists for convenient use of NCBI BLAST+ and DIAMOND via the Galaxy API.
Results: blast2galaxy is a Python package that uses the Galaxy API to run sequence alignments with NCBI BLAST+ and DIAMOND as Galaxy-wrapped tools on compatible servers. It includes a command-line interface that mirrors the CLI of BLAST+ and DIAMOND and a high-level Python API for direct alignments from Python applications. The package relies on bioblend for communication with the Galaxy API.
Availability and implementation: blast2galaxy is available as open-source software under the MIT license. The source code is available on GitHub: https://github.com/IPK-BIT/blast2galaxy. It can be installed from the Python Package Index using "pip install blast2galaxy" or from the Bioconda channel using "conda install -c bioconda blast2galaxy". Docker and Apptainer images are available and referenced in the documentation, which is available at https://blast2galaxy.readthedocs.io.
Title: blast2galaxy: a CLI and Python API for BLAST+ and DIAMOND searches on Galaxy servers. Bioinformatics Advances 4(1), vbae185. DOI: 10.1093/bioadv/vbae185. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11629687/pdf/
Pub Date: 2024-11-22; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbae173
Kai Wang, Yueming Hu, Sida Li, Ming Chen, Zhong Li
Motivation: Much evidence suggests that the subcellular localization of long noncoding RNAs (LncRNAs) provides key insights for the study of their biological function.
Results: This study proposes a novel deep learning framework, LncLSTA, designed for predicting the subcellular localization of LncRNAs. It first takes the LncRNA sequence, electron-ion interaction pseudopotentials, and nucleotide chemical properties as feature inputs. Departing from conventional k-mer approaches, the model uses a set of 1D convolution and max-pooling operations for dynamic feature aggregation. Furthermore, LncLSTA integrates a long-short term attention module with a bidirectional long short-term memory network to comprehensively extract sequence information. In addition, it incorporates a TextCNN module to enhance accuracy and robustness in subcellular localization tasks. Experimental results demonstrate the efficacy of LncLSTA, which outperforms other state-of-the-art methods. Notably, LncLSTA exhibits transfer learning capability, extending its utility to predicting the subcellular localization of mRNAs while maintaining consistently satisfactory results. This research contributes valuable insights into understanding the biological functions of LncRNAs through subcellular localization, emphasizing the potential of deep learning approaches in advancing RNA-related studies.
Availability and implementation: The source code is publicly available at https://bis.zju.edu.cn/LncLSTA.
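The feature inputs and the convolution/pooling step described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the LncLSTA code. The EIIP values and the three-bit nucleotide chemical property (NCP) codes are the commonly cited ones rather than values taken from the paper, and the kernel, pooling size, and zero encoding of ambiguous bases are hypothetical choices.

```python
import numpy as np

BASES = "ACGT"
# Commonly cited electron-ion interaction pseudopotential values (assumed).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
# Commonly cited NCP codes: (ring structure, functional group, hydrogen bond).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}


def encode(seq):
    """Return an (L, 8) matrix: 4 one-hot + 1 EIIP + 3 NCP columns per base."""
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alphabets alike
    rows = []
    for base in seq:
        if base not in EIIP:  # ambiguous base: all-zero row (assumption)
            rows.append([0.0] * 8)
            continue
        onehot = [1.0 if base == b else 0.0 for b in BASES]
        rows.append(onehot + [EIIP[base]] + [float(v) for v in NCP[base]])
    return np.array(rows, dtype=float).reshape(-1, 8)


def conv1d(x, kernel):
    """Valid 1D convolution (cross-correlation) along the sequence axis."""
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel)
                     for i in range(x.shape[0] - k + 1)])


def max_pool1d(x, size):
    """Non-overlapping max pooling; trailing positions are dropped."""
    usable = (len(x) // size) * size
    return x[:usable].reshape(-1, size).max(axis=1)


features = encode("ACGUACGU")          # shape (8, 8)
kernel = np.ones((3, 8))               # one hypothetical untrained filter
pooled = max_pool1d(conv1d(features, kernel), size=2)
print(features.shape, pooled.shape)
```

In a trained model the filter weights are learned and many filters run in parallel. The point here is only that convolution over per-base feature columns aggregates local context without enumerating k-mers explicitly.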
Title: LncLSTA: a versatile predictor unveiling subcellular localization of lncRNAs through long-short term attention. Bioinformatics Advances 5(1), vbae173. DOI: 10.1093/bioadv/vbae173. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11700581/pdf/
Pub Date: 2024-11-20; eCollection Date: 2025-01-01; DOI: 10.1093/bioadv/vbae183
Dany Domínguez-Pérez, Guillermin Agüero-Chapin, Serena Leone, Maria Vittoria Modica
Motivation: Accurate sequence length profiling is essential in bioinformatics, particularly in genomics and proteomics. Existing tools like SeqKit and the Trinity toolkit provide basic sequence statistics but often fall short in offering comprehensive analytics and plotting options. For instance, SeqKit is a very complete and fast tool for sequence analysis that delivers useful metrics (e.g. number of sequences, average, minimum, and maximum lengths) and can return the sequences either shorter or longer than a given length, but not both at once. Similarly, Trinity's Perl-based scripts provide detailed contig length distributions (e.g. N50, median, and average lengths) but do not include the total number of sequences or offer graphical representations of the data.
Results: Given that key sequence analysis tasks are often distributed across multiple tools, we introduce SeqLengthPlot v2.0, an all-in-one, easy-to-use Python-based tool. Through a simple command-line interface, this straightforward tool enables users to split input FASTA files (nucleotide and protein) into two distinct files based on a customizable sequence length cutoff. It also automatically retrieves the resulting FASTA files, generates length distribution plots, and provides comprehensive statistical summaries.
Availability and implementation: SeqLengthPlot_v2.0.2 can be accessed at https://github.com/danydguezperez/SeqLengthPlot/releases/tag/v2.0.2.
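The core operations described above (splitting FASTA records at a length cutoff and summarizing lengths) can be sketched with the Python standard library alone. This is not SeqLengthPlot's code. The function names are hypothetical, and placing sequences whose length equals the cutoff in the "long" group is an assumption, as the abstract does not say how ties are handled.

```python
from statistics import mean, median


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records, cutoff):
    """Return (short, long) record lists; length == cutoff counts as long."""
    short, long_ = [], []
    for header, seq in records:
        (long_ if len(seq) >= cutoff else short).append((header, seq))
    return short, long_


def n50(lengths):
    """Smallest length L such that sequences >= L cover half the total."""
    total, running = sum(lengths), 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def length_stats(records):
    """Summary statistics over the sequence lengths of the records."""
    lengths = [len(seq) for _, seq in records]
    if not lengths:
        return {"count": 0}
    return {"count": len(lengths), "min": min(lengths), "max": max(lengths),
            "mean": mean(lengths), "median": median(lengths),
            "n50": n50(lengths)}


example = [">a", "ACGT", ">b", "ACGTACGTAC", ">c", "AC", "GT"]
records = list(read_fasta(example))
short, long_ = split_by_length(records, cutoff=5)
print(length_stats(records), len(short), len(long_))
```

To work on a real file, pass an open file handle to `read_fasta`. Plotting the length distribution would need an additional library such as matplotlib and is left out of this sketch.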
Title: SeqLengthPlot v2.0: an all-in-one, easy-to-use tool for visualizing and retrieving sequence lengths from FASTA files. Bioinformatics Advances 5(1), vbae183. DOI: 10.1093/bioadv/vbae183. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11671033/pdf/
Pub Date: 2024-11-19; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae179
Jin-Ok Lee, Sejoon Lee, Dongyoon Lee, Taeyeon Hwang, Soobok Joe, Jin Ok Yang, Jibin Jeong, Jung Hun Ohn, Jee Hyun Kim
Summary: Transposable elements (TEs), commonly referred to as "mobile elements," are DNA segments capable of relocating within a genome. Although TEs were initially disregarded as "junk DNA" devoid of specific functionality, it has become evident that they have diverse influences on an organism's biology and health. The impact of these elements varies according to their location, classification, and effects on specific genes or regulatory components. Despite their significant roles, resources on TEs in population-scale genome sequencing remain scarce. Here, we analyze whole-genome sequencing data from the Korean Genome and Epidemiology Study, encompassing 2500 Korean individuals. To facilitate convenient data access and observation, we developed a web-based database, KTED. Additionally, we examined the differential distributions of TEs across five common disease groups: dyslipidemia, hypertension, diabetes, thyroid disease, and cancer.
Availability and implementation: https://snubh.shinyapps.io/KTED.
Title: KTED: a comprehensive web-based database for transposable elements in the Korean genome. Bioinformatics Advances 4(1), vbae179. DOI: 10.1093/bioadv/vbae179. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11652267/pdf/
Pub Date: 2024-11-18; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae166
Cath Brooksbank, Michelle D Brazas, Nicola Mulder, Russell Schwartz, Verena Ras, Sarah L Morgan, Marta Lloret Llinares, Patricia Carvajal López, Lee Larcombe, Amel Ghouila, Tom Hancocks, Venkata Satagopam, Javier De Las Rivas, Gaston Mazandu, Bruno Gaeta
Motivation: Developing competency in the broad area of bioinformatics is challenging globally, owing to the breadth of the field and the diversity of its audiences for education and training. Course design can be facilitated by the use of a competency framework: a set of competency requirements that define the knowledge, skills and attitudes needed by individuals in (or aspiring to be in) a particular profession or role. These competency requirements can help to define curricula, as they inform both the content and the level to which competency needs to be developed. The International Society for Computational Biology (ISCB) developed a list of bioinformatics competencies in 2014, and these have undergone several rounds of improvement. In consultation with a broad bioinformatics training community, these have now been further refined and extended to include knowledge, skills and attitudes, and mappings to previous and other existing competency frameworks.
Results: Here, we present version 3 of the ISCB competency framework. We describe how it was developed and how to access it, and provide some examples of how it has been used.
Availability and implementation: The framework is openly accessible at https://competency.ebi.ac.uk/framework/iscb/3.0/competencies.
Title: The ISCB competency framework v. 3: a revised and extended standard for bioinformatics education and training. Bioinformatics Advances 4(1), vbae166. DOI: 10.1093/bioadv/vbae166. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646570/pdf/
Pub Date: 2024-11-14; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae178
Gordon Grabert, Daniel Dehncke, Tushar More, Markus List, Anke R M Kraft, Markus Cornberg, Karsten Hiller, Tim Kacprowski
Motivation: The availability of longitudinal omics data is increasing in metabolomics research. Viewing metabolomics data over time provides detailed insight into biological processes and fosters understanding of how systems react over time. However, the analysis of longitudinal metabolomics data poses various challenges, both in terms of statistical evaluation and visualization.
Results: To make exploratory analysis of longitudinal data readily available to researchers without a formal background in computer science and programming, we present MEtabolite Trajectory ExplORer (MeTEor). MeTEor is an R Shiny app providing a comprehensive set of statistical analysis methods. To demonstrate the capabilities of MeTEor, we replicated the analysis of metabolomics data from a previously published study on COVID-19 patients.
Availability and implementation: MeTEor is available as an R package and as a Docker image. Source code and instructions for setting up the app can be found on GitHub (https://github.com/scibiome/meteor). The Docker image is available at Docker Hub (https://hub.docker.com/r/gordomics/meteor). MeTEor has been tested on Microsoft Windows, Unix/Linux, and macOS.
Title: MeTEor: an R Shiny app for exploring longitudinal metabolomics data. Bioinformatics Advances 4(1), vbae178. DOI: 10.1093/bioadv/vbae178. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631383/pdf/
Pub Date: 2024-11-14; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae175
Bianka Alexandra Pasat, Eleftherios Pilalis, Katarzyna Mnich, Afshin Samali, Aristotelis Chatziioannou, Adrienne M Gorman
Motivation: Analysis of gene and isoform expression levels is becoming critical for the detailed understanding of biochemical mechanisms. In addition, integrating RNA-seq data with other omics data types, such as proteomics and metabolomics, provides a strong approach for consolidating our understanding of biological processes across various organizational tiers, thus promoting the identification of potential therapeutic targets.
Results: We present MultiOmicsIntegrator (MOI), an inclusive pipeline for comprehensive omics analyses. MOI represents a unified approach that performs in-depth individual analyses of diverse omics data. Specifically, it offers exhaustive analysis of RNA-seq data at the level of genes, gene isoforms, and miRNA, coupled with functional annotation and structure prediction of these transcripts. Proteomics and metabolomics data are also supported, providing a holistic view of biological systems. Finally, MOI has tools to integrate multiple, diverse omics datasets simultaneously, with both data- and function-driven approaches, fostering a deeper understanding of intricate biological interactions.
Availability and implementation: MOI and ReadTheDocs.
Title: MultiOmicsIntegrator: a nextflow pipeline for integrated omics analyses. Bioinformatics Advances 4(1), vbae175. DOI: 10.1093/bioadv/vbae175. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11576358/pdf/
Pub Date: 2024-11-13; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae155
Julia Wrobel, Alex C Soupir, Mitchell T Hayes, Lauren C Peres, Thao Vu, Andrew Leroux, Brooke L Fridley
Summary: Technologies that produce spatial single-cell (SC) data have revolutionized the study of tissue microstructures and promise to advance personalized treatment of cancer by revealing new insights about the tumor microenvironment. Functional data analysis (FDA) is an ideal analytic framework for connecting cell spatial relationships to patient outcomes, but can be challenging to implement. To address this need, we present mxfda, an R package for end-to-end analysis of SC spatial data using FDA. mxfda implements a suite of methods to facilitate spatial analysis of SC imaging data using FDA techniques.
Availability and implementation: The mxfda R package is freely available at https://cran.r-project.org/package=mxfda and has detailed documentation, including four vignettes, available at http://juliawrobel.com/mxfda/.
{"title":"mxfda: a comprehensive toolkit for functional data analysis of single-cell spatial data.","authors":"Julia Wrobel, Alex C Soupir, Mitchell T Hayes, Lauren C Peres, Thao Vu, Andrew Leroux, Brooke L Fridley","doi":"10.1093/bioadv/vbae155","DOIUrl":"10.1093/bioadv/vbae155","url":null,"abstract":"<p><strong>Summary: </strong>Technologies that produce spatial single-cell (SC) data have revolutionized the study of tissue microstructures and promise to advance personalized treatment of cancer by revealing new insights about the tumor microenvironment. Functional data analysis (FDA) is an ideal analytic framework for connecting cell spatial relationships to patient outcomes, but can be challenging to implement. To address this need, we present mxfda, an R package for end-to-end analysis of SC spatial data using FDA. mxfda implements a suite of methods to facilitate spatial analysis of SC imaging data using FDA techniques.</p><p><strong>Availability and implementation: </strong>The mxfda R package is freely available at https://cran.r-project.org/package=mxfda and has detailed documentation, including four vignettes, available at http://juliawrobel.com/mxfda/.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae155"},"PeriodicalIF":2.4,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568348/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-11-13; eCollection Date: 2024-01-01; DOI: 10.1093/bioadv/vbae174
Timothy Páez-Watson, Ricardo Hernández Medina, Loek Vellekoop, Mark C M van Loosdrecht, S Aljoscha Wahl
Summary: We present py_cFBA, a Python-based toolbox for conditional flux balance analysis (cFBA). Our toolbox allows for an easy implementation of cFBA models using a well-documented and modular approach and supports the generation of Systems Biology Markup Language models. The toolbox is designed to be user-friendly, versatile, and freely available to non-commercial users, serving as a valuable resource for researchers predicting metabolic behaviour with resource allocation in dynamic-cyclic environments.
Availability and implementation: Extensive documentation, installation steps, tutorials, and examples are available at https://tp-watson-python-cfba.readthedocs.io/en/. The py_cFBA Python package is available at https://pypi.org/project/py-cfba/.
{"title":"Conditional flux balance analysis toolbox for python: application to research metabolism in cyclic environments.","authors":"Timothy Páez-Watson, Ricardo Hernández Medina, Loek Vellekoop, Mark C M van Loosdrecht, S Aljoscha Wahl","doi":"10.1093/bioadv/vbae174","DOIUrl":"10.1093/bioadv/vbae174","url":null,"abstract":"<p><strong>Summary: </strong>We present py_cFBA, a Python-based toolbox for conditional flux balance analysis (cFBA). Our toolbox allows for an easy implementation of cFBA models using a well-documented and modular approach and supports the generation of Systems Biology Markup Language models. The toolbox is designed to be user-friendly, versatile, and freely available to non-commercial users, serving as a valuable resource for researchers predicting metabolic behaviour with resource allocation in dynamic-cyclic environments.</p><p><strong>Availability and implementation: </strong>Extensive documentation, installation steps, tutorials, and examples are available at https://tp-watson-python-cfba.readthedocs.io/en/. The py_cFBA python package is available at https://pypi.org/project/py-cfba/.</p>","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":"4 1","pages":"vbae174"},"PeriodicalIF":2.4,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11593493/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142735127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}