
Bioinformatics Advances: latest articles

The combined focal loss and dice loss function improves the segmentation of beta-sheets in medium-resolution cryo-electron-microscopy density maps.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-22 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae169
Yongcheng Mu, Thu Nguyen, Bryan Hawickhorst, Willy Wriggers, Jiangwen Sun, Jing He

Summary: Although multiple neural networks have been proposed for detecting secondary structures from medium-resolution (5-10 Å) cryo-electron microscopy (cryo-EM) maps, the loss functions used in the existing deep learning networks are primarily based on cross-entropy loss, which is known to be sensitive to class imbalances. We investigated five loss functions: cross-entropy, Focal loss, Dice loss, and two combined loss functions. Using a U-Net architecture in our DeepSSETracer method and a dataset composed of 1355 box-cropped atomic-structure/density-map pairs, we found that a newly designed loss function that combines Focal loss and Dice loss provides the best overall detection accuracy for secondary structures. For β-sheet voxels, which are generally much harder to detect than helix voxels, the combined loss function achieved a significant improvement (an 8.8% increase in the F1 score) compared to the cross-entropy loss function and a noticeable improvement from the Dice loss function. This study demonstrates the potential for designing more effective loss functions for hard cases in the segmentation of secondary structures. The newly trained model was incorporated into DeepSSETracer 1.1 for the segmentation of protein secondary structures in medium-resolution cryo-EM map components. DeepSSETracer can be integrated into ChimeraX, a popular molecular visualization software.
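The idea of combining Focal loss and Dice loss can be sketched as follows. This is a minimal illustration for a single binary class (for example, β-sheet voxels versus everything else), not the DeepSSETracer implementation: the weighted-sum form, the `weight`, `gamma` and `smooth` parameters, and all function names are assumptions, since the abstract does not state how the two terms are combined.

```python
import math

def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t) per voxel."""
    total = 0.0
    for p, y in zip(probs, targets):
        p_t = p if y == 1 else 1.0 - p          # probability given to the true class
        p_t = min(max(p_t, eps), 1.0 - eps)     # clamp to avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + s) / (sum(probs) + sum(targets) + s)."""
    overlap = sum(p * y for p, y in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)

def combined_loss(probs, targets, weight=0.5, gamma=2.0):
    """Hypothetical combination: a weighted sum of the two losses (an assumption)."""
    return (weight * focal_loss(probs, targets, gamma)
            + (1.0 - weight) * dice_loss(probs, targets))

# Imbalanced toy example: 8 background voxels, 2 foreground voxels.
targets = [0] * 8 + [1] * 2
good = [0.1] * 8 + [0.9] * 2   # foreground detected confidently
poor = [0.1] * 8 + [0.3] * 2   # foreground mostly missed
print(combined_loss(good, targets), combined_loss(poor, targets))
```

With `gamma=0` the focal term reduces to ordinary cross-entropy, which makes the relationship to the baseline loss easy to see. The Dice term depends on overlap with the foreground rather than on a per-voxel average, which is why it is less dominated by the abundant background class.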

Availability and implementation: https://www.cs.odu.edu/~bioinfo/B2I_Tools/.

Citations: 0
blast2galaxy: a CLI and Python API for BLAST+ and DIAMOND searches on Galaxy servers.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-22 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae185
Patrick König, Anne Fiebig, Thomas Münch, Björn Grüning, Uwe Scholz

Motivation: The Galaxy workflow system is an open-source platform supporting data-intensive research in life sciences, featuring a user-friendly web interface for complex analyses without extensive programming. It also offers a representational state transfer based API, enabling remote execution of specific tools. Galaxy supports similarity searches for nucleotide and amino acid sequences, with integrated tools like NCBI BLAST+ and DIAMOND. However, no specialized software currently exists for convenient use of NCBI BLAST+ and DIAMOND via the Galaxy API.

Results: blast2galaxy is a Python package that uses the Galaxy API to run sequence alignments with NCBI BLAST+ and DIAMOND as Galaxy-wrapped tools on compatible servers. It includes a command-line interface that mirrors the CLI of BLAST+ and DIAMOND and a high-level Python API for direct alignments from Python applications. The package relies on bioblend for communication with the Galaxy API.

Availability and implementation: blast2galaxy is available as open-source software under the MIT license. The source code is available on GitHub: https://github.com/IPK-BIT/blast2galaxy. It can be installed from the Python Package Index using "pip install blast2galaxy" or from the Bioconda channel using "conda install -c bioconda blast2galaxy". Docker and Apptainer images are available and are referenced in the documentation at https://blast2galaxy.readthedocs.io.

Citations: 0
LncLSTA: a versatile predictor unveiling subcellular localization of lncRNAs through long-short term attention.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-22 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbae173
Kai Wang, Yueming Hu, Sida Li, Ming Chen, Zhong Li

Motivation: Much evidence suggests that the subcellular localization of long noncoding RNAs (LncRNAs) provides key insights for the study of their biological function.

Results: This study proposes a novel deep learning framework, LncLSTA, designed for predicting the subcellular localization of LncRNAs. It first takes the LncRNA sequence, electron-ion interaction pseudopotentials, and nucleotide chemical properties as feature inputs. Departing from conventional k-mer approaches, the model uses a set of 1D convolution and max-pooling operations for dynamic feature aggregation. Furthermore, LncLSTA integrates a long-short term attention module with a bidirectional long short-term memory network to comprehensively extract sequence information. In addition, it incorporates a TextCNN module to enhance accuracy and robustness in subcellular localization tasks. Experimental results show that LncLSTA outperforms other state-of-the-art methods. Notably, LncLSTA exhibits transfer learning capability, extending its utility to predicting the subcellular localization of mRNAs while maintaining consistently satisfactory results. This research contributes valuable insights into understanding the biological functions of LncRNAs through subcellular localization, emphasizing the potential of deep learning approaches in advancing RNA-related studies.
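The three feature inputs named above (sequence, electron-ion interaction pseudopotentials, and nucleotide chemical properties) can be illustrated with a small per-nucleotide encoder. This is a sketch under assumptions, not the LncLSTA code: the EIIP and chemical-property values are the ones commonly used in the literature, and the `encode` function, the 8-dimensional layout, and the handling of unknown bases are hypothetical.

```python
# Commonly cited EIIP values; U is assumed to share the value used for T.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335, "U": 0.1335}

# Nucleotide chemical property as (ring structure, functional group, hydrogen bonding).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
    "U": (0, 0, 1),
}

# One-hot column per base; T and U share a column (an assumption).
ONE_HOT_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3, "U": 3}

def encode(sequence):
    """Return one 8-dimensional vector per nucleotide:
    4 one-hot values, 3 chemical-property bits, 1 EIIP value.
    Unknown symbols such as N are encoded as all zeros (an assumption)."""
    features = []
    for base in sequence.upper():
        if base not in ONE_HOT_INDEX:
            features.append([0, 0, 0, 0, 0, 0, 0, 0.0])
            continue
        one_hot = [0, 0, 0, 0]
        one_hot[ONE_HOT_INDEX[base]] = 1
        features.append(one_hot + list(NCP[base]) + [EIIP[base]])
    return features

for row in encode("ACGU"):
    print(row)
```

A matrix like this, with one row per position, is the kind of input that 1D convolution and max-pooling layers can aggregate without fixing a k-mer size in advance.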

Availability and implementation: The source code is publicly available at https://bis.zju.edu.cn/LncLSTA.

Citations: 0
SeqLengthPlot v2.0: an all-in-one, easy-to-use tool for visualizing and retrieving sequence lengths from FASTA files.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-20 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbae183
Dany Domínguez-Pérez, Guillermin Agüero-Chapin, Serena Leone, Maria Vittoria Modica

Motivation: Accurate sequence length profiling is essential in bioinformatics, particularly in genomics and proteomics. Existing tools like SeqKit and the Trinity toolkit provide basic sequence statistics but often fall short in offering comprehensive analytics and plotting options. For instance, SeqKit is a very complete and fast tool for sequence analysis, delivering useful metrics (e.g. number of sequences, average, minimum, and maximum lengths) and can return sequences either shorter or longer (but not both at once) for a given length. Similarly, Trinity's Perl-based scripts provide detailed contig length distributions (e.g. N50, median, and average lengths) but do not include the total number of sequences or offer graphical representations of the data.

Results: Given that key sequence analysis tasks are often distributed across multiple tools, we introduce SeqLengthPlot v2.0, an all-in-one, easy-to-use Python-based tool. Through a simple command-line interface, this straightforward tool enables users to split input FASTA files (nucleotide and protein) into two distinct files based on a customizable sequence length cutoff. It also automatically retrieves the resulting FASTA files, generates length distribution plots, and provides comprehensive statistical summaries.
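The core operation described above, splitting FASTA records into two sets around a length cutoff and summarizing the lengths, can be sketched in a few lines of standard-library Python. This is an illustration only, not SeqLengthPlot's code: the function names are hypothetical, and the choice to place sequences exactly at the cutoff in the "long" set is an assumption.

```python
import statistics

def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def split_by_length(records, cutoff):
    """Split records into (shorter than cutoff, at or above cutoff)."""
    short = [r for r in records if len(r[1]) < cutoff]
    long_ = [r for r in records if len(r[1]) >= cutoff]
    return short, long_

def length_summary(records):
    """Basic length statistics for a non-empty list of records."""
    lengths = [len(seq) for _, seq in records]
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
    }

example = """>seq1
ACGT
ACGT
>seq2
ACG
>seq3
ACGTACGTACGT
"""
records = parse_fasta(example)
short, long_ = split_by_length(records, cutoff=8)
print([h for h, _ in short], [h for h, _ in long_])
print(length_summary(records))
```

Each of the two lists could then be written to its own FASTA file, and the `lengths` values passed to any plotting library to draw the distribution.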

Availability and implementation: SeqLengthPlot_v2.0.2 can be accessed at https://github.com/danydguezperez/SeqLengthPlot/releases/tag/v2.0.2.

Citations: 0
KTED: a comprehensive web-based database for transposable elements in the Korean genome.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-19 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae179
Jin-Ok Lee, Sejoon Lee, Dongyoon Lee, Taeyeon Hwang, Soobok Joe, Jin Ok Yang, Jibin Jeong, Jung Hun Ohn, Jee Hyun Kim

Summary: Transposable elements (TEs), commonly referred to as "mobile elements," constitute DNA segments capable of relocating within a genome. Initially disregarded as "junk DNA" devoid of specific functionality, it has become evident that TEs have diverse influences on an organism's biology and health. The impact of these elements varies according to their location, classification, and their effects on specific genes or regulatory components. Despite their significant roles, a paucity of resources concerning TEs in population-scale genome sequencing remains. Herein, we analyze whole-genome sequencing data sourced from the Korean Genome and Epidemiology Study, encompassing 2500 Korean individuals. To facilitate convenient data access and observation, we developed a web-based database, KTED. Additionally, we scrutinized the differential distributions of TEs across five distinct common disease groups: dyslipidemia, hypertension, diabetes, thyroid disease, and cancer.

Availability and implementation: https://snubh.shinyapps.io/KTED.

Citations: 0
The ISCB competency framework v. 3: a revised and extended standard for bioinformatics education and training.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-18 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae166
Cath Brooksbank, Michelle D Brazas, Nicola Mulder, Russell Schwartz, Verena Ras, Sarah L Morgan, Marta Lloret Llinares, Patricia Carvajal López, Lee Larcombe, Amel Ghouila, Tom Hancocks, Venkata Satagopam, Javier De Las Rivas, Gaston Mazandu, Bruno Gaeta

Motivation: Developing competency in the broad area of bioinformatics is challenging globally, owing to the breadth of the field and the diversity of its audiences for education and training. Course design can be facilitated by the use of a competency framework, a set of competency requirements that define the knowledge, skills and attitudes needed by individuals in (or aspiring to be in) a particular profession or role. These competency requirements can help to define curricula, as they can inform both the content and the level to which competency needs to be developed. The International Society for Computational Biology (ISCB) developed a list of bioinformatics competencies in 2014, and these have undergone several rounds of improvement. In consultation with a broad bioinformatics training community, these have now been further refined and extended to include knowledge, skills and attitudes, and mappings to previous and other existing competency frameworks.

Results: Here, we present version 3 of the ISCB competency framework. We describe how it was developed and how to access it, as well as providing some examples of how it has been used.

Availability and implementation: The framework is openly accessible at https://competency.ebi.ac.uk/framework/iscb/3.0/competencies.

Citations: 0
MeTEor: an R Shiny app for exploring longitudinal metabolomics data.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-14 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae178
Gordon Grabert, Daniel Dehncke, Tushar More, Markus List, Anke R M Kraft, Markus Cornberg, Karsten Hiller, Tim Kacprowski

Motivation: The availability of longitudinal omics data is increasing in metabolomics research. Viewing metabolomics data over time provides detailed insight into biological processes and fosters understanding of how systems react over time. However, the analysis of longitudinal metabolomics data poses various challenges, both in terms of statistical evaluation and visualization.

Results: To make explorative analysis of longitudinal data readily available to researchers without formal background in computer science and programming, we present MEtabolite Trajectory ExplORer (MeTEor). MeTEor is an R Shiny app providing a comprehensive set of statistical analysis methods. To demonstrate the capabilities of MeTEor, we replicated the analysis of metabolomics data from a previously published study on COVID-19 patients.

Availability and implementation: MeTEor is available as an R package and as a Docker image. Source code and instructions for setting up the app can be found on GitHub (https://github.com/scibiome/meteor). The Docker image is available at Docker Hub (https://hub.docker.com/r/gordomics/meteor). MeTEor has been tested on Microsoft Windows, Unix/Linux, and macOS.

Citations: 0
MultiOmicsIntegrator: a nextflow pipeline for integrated omics analyses.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-14 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae175
Bianka Alexandra Pasat, Eleftherios Pilalis, Katarzyna Mnich, Afshin Samali, Aristotelis Chatziioannou, Adrienne M Gorman

Motivation: Analysis of gene and isoform expression levels is becoming critical for the detailed understanding of biochemical mechanisms. In addition, integrating RNA-seq data with other omics data types, such as proteomics and metabolomics, provides a strong approach for consolidating our understanding of biological processes across various organizational tiers, thus promoting the identification of potential therapeutic targets.

Results: We present MultiOmicsIntegrator (MOI), an inclusive pipeline for comprehensive omics analyses. MOI represents a unified approach that performs in-depth individual analyses of diverse omics. Specifically, it offers exhaustive analysis of RNA-seq data at the level of genes, gene isoforms, and miRNA, coupled with functional annotation and structure prediction of these transcripts. Additionally, proteomics and metabolomics data are supported, providing a holistic view of biological systems. Finally, MOI has tools to simultaneously integrate multiple, diverse omics datasets with both data- and function-driven approaches, fostering a deeper understanding of intricate biological interactions.

Availability and implementation: MOI and ReadTheDocs.

引用次数: 0
mxfda: a comprehensive toolkit for functional data analysis of single-cell spatial data.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-13 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae155
Julia Wrobel, Alex C Soupir, Mitchell T Hayes, Lauren C Peres, Thao Vu, Andrew Leroux, Brooke L Fridley

Summary: Technologies that produce spatial single-cell (SC) data have revolutionized the study of tissue microstructures and promise to advance personalized treatment of cancer by revealing new insights about the tumor microenvironment. Functional data analysis (FDA) is an ideal analytic framework for connecting cell spatial relationships to patient outcomes, but can be challenging to implement. To address this need, we present mxfda, an R package for end-to-end analysis of SC spatial data using FDA. mxfda implements a suite of methods to facilitate spatial analysis of SC imaging data using FDA techniques.

Availability and implementation: The mxfda R package is freely available at https://cran.r-project.org/package=mxfda and has detailed documentation, including four vignettes, available at http://juliawrobel.com/mxfda/.
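To make the idea of a spatial summary function concrete, the sketch below computes a naive Ripley's K estimate for one toy sample. Functional data analysis would then treat the resulting curve over radii as one observation per sample. This is only an illustration under stated assumptions: the abstract does not say which summary functions mxfda computes, the function name `ripley_k` and the toy coordinates are hypothetical, no edge correction is applied, and Python is used here although mxfda itself is an R package.

```python
import math
from itertools import permutations


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at radius r (no edge correction).

    Hypothetical helper for illustration; not part of mxfda.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Count ordered pairs (i, j) with i != j whose distance is at most r.
    close_pairs = sum(
        1
        for (x1, y1), (x2, y2) in permutations(points, 2)
        if math.hypot(x1 - x2, y1 - y2) <= r
    )
    return area * close_pairs / (n * (n - 1))


# Toy "cells": the four corners of a unit square inside a 2 x 2 window.
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
radii = [0.5, 1.0, 1.5]

# One K value per radius: the curve that a functional model would take as input.
k_curve = [ripley_k(cells, r, area=4.0) for r in radii]
print(list(zip(radii, k_curve)))
```

In a real analysis each sample would contribute its own curve, and the curves would be related to patient outcomes with functional regression or similar methods.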

Conditional flux balance analysis toolbox for python: application to research metabolism in cyclic environments.
IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date : 2024-11-13 eCollection Date: 2024-01-01 DOI: 10.1093/bioadv/vbae174
Timothy Páez-Watson, Ricardo Hernández Medina, Loek Vellekoop, Mark C M van Loosdrecht, S Aljoscha Wahl

Summary: We present py_cFBA, a Python-based toolbox for conditional flux balance analysis (cFBA). Our toolbox allows for an easy implementation of cFBA models using a well-documented and modular approach and supports the generation of Systems Biology Markup Language models. The toolbox is designed to be user-friendly, versatile, and freely available to non-commercial users, serving as a valuable resource for researchers predicting metabolic behaviour with resource allocation in dynamic-cyclic environments.

Availability and implementation: Extensive documentation, installation steps, tutorials, and examples are available at https://tp-watson-python-cfba.readthedocs.io/en/. The py_cFBA Python package is available at https://pypi.org/project/py-cfba/.
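For readers unfamiliar with the underlying technique, the sketch below solves a plain (non-conditional) flux balance analysis problem as a linear program with SciPy. It does not use or imitate the py_cFBA API, and it omits the time-dependent resource allocation that distinguishes cFBA. The four-reaction network, its bounds, and the function name `max_export_flux` are assumptions made up for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not taken from the paper):
#   R1: -> A        (uptake, at most 10 units)
#   R2: A -> B      (conversion, at most 8 units)
#   R3: B ->        (export; the flux to maximize)
#   R4: A ->        (overflow)
# Rows are metabolites, columns are reactions R1..R4.
S = np.array([
    [1, -1,  0, -1],  # metabolite A
    [0,  1, -1,  0],  # metabolite B
])
bounds = [(0, 10), (0, 8), (0, None), (0, None)]


def max_export_flux():
    """Maximize the R3 flux subject to the steady-state constraint S @ v = 0."""
    c = np.array([0, 0, -1, 0])  # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


fluxes = max_export_flux()
print(dict(zip(["R1", "R2", "R3", "R4"], fluxes.round(3))))
```

The export flux is limited by the capacity of R2. The uptake and overflow fluxes are not uniquely determined, which is typical of flux balance solutions.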
