Hasan Balci, Adrien Rougny, Rupert Overall, Irina Balaur, Michael L Blinov, Hanna Borlinghaus, Emek Demir, Andreas Dräger, Robin Haw, Alexander Mazein, Huaiyu Mi, Stuart Moodie, Falk Schreiber, Anatoly Sorokin, Vasundra Touré, Alice Villéger, Tobias Czauderna, Ugur Dogrusoz, Augustin Luna
The Systems Biology Graphical Notation (SBGN) is an international community effort to standardize the visualization of pathways and networks, making them accessible to scientists from diverse fields while facilitating efficient and accurate knowledge exchange among research communities, industry, and other stakeholders in systems biology. SBGN consists of three complementary languages - Entity Relationship (ER), Activity Flow (AF), and Process Description (PD) - each addressing biological and biochemical systems at varying levels of detail. PD, closely aligned with the metabolic and regulatory pathways found in biological literature, books, and academic courses, provides well-defined semantics for precisely representing biological information. The PD language uses a graph structure to represent mechanistic and temporal relationships of biological interactions and transformations. It incorporates distinct node types, including entity pools (e.g., metabolites, proteins, genes, and complexes) and processes (e.g., reactions and associations), with edges representing the connections between nodes (e.g., consumption, production, stimulation, and inhibition). This document details Level 1 Version 2.1 of the PD specification, including several improvements over the previous version (Level 1 Version 2.0): 1) refinements to document structure and terminology, 2) clarifications and updates to specification content, and 3) updated figures and rules.
Title: Systems biology graphical notation: process description language level 1 version 2.1. Journal: Journal of Integrative Bioinformatics. Publication date: 2026-01-23. DOI: 10.1515/jib-2025-0018 (https://doi.org/10.1515/jib-2025-0018)
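The graph structure described in the abstract above can be illustrated with a small data structure. This is only a sketch under stated assumptions: the class name `PDMap`, its methods, and the glucose example are hypothetical, only the four arc classes named in the abstract are modelled, and the direction rules are a simplification of the full PD specification.

```python
from dataclasses import dataclass, field

# Arc classes named in the abstract; the PD specification defines more.
ARC_CLASSES = {"consumption", "production", "stimulation", "inhibition"}


@dataclass
class PDMap:
    """Toy container for a Process Description map (illustrative only)."""
    entity_pools: dict = field(default_factory=dict)  # node id -> label
    processes: set = field(default_factory=set)       # process node ids
    arcs: list = field(default_factory=list)          # (source, target, arc class)

    def add_entity_pool(self, node_id, label):
        self.entity_pools[node_id] = label

    def add_process(self, node_id):
        self.processes.add(node_id)

    def add_arc(self, source, target, arc_class):
        if arc_class not in ARC_CLASSES:
            raise ValueError(f"unknown arc class: {arc_class}")
        # Simplified rule: production runs process -> entity pool,
        # every other arc class here runs entity pool -> process.
        if arc_class == "production":
            ok = source in self.processes and target in self.entity_pools
        else:
            ok = source in self.entity_pools and target in self.processes
        if not ok:
            raise ValueError(f"{arc_class} arc cannot connect {source} -> {target}")
        self.arcs.append((source, target, arc_class))


# Hypothetical example: one process turning glucose into glucose 6-phosphate.
pd_map = PDMap()
pd_map.add_entity_pool("glucose", "glucose")
pd_map.add_entity_pool("g6p", "glucose 6-phosphate")
pd_map.add_entity_pool("hexokinase", "hexokinase")
pd_map.add_process("p1")
pd_map.add_arc("glucose", "p1", "consumption")
pd_map.add_arc("p1", "g6p", "production")
pd_map.add_arc("hexokinase", "p1", "stimulation")
```

Keeping entity pools and processes in separate collections lets the arc check reject connections such as a consumption arc that starts at a process node.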
Elena Ignatieva, Sergey Lashin, Roman Ivanov, Valentin Suslov, Angelina Mikhailova, Nikolay Kolchanov
Appetite is an instinct that has been formed through evolution. Appetite promotes normal growth and development in humans. However, under conditions of food abundance, appetite can become excessive, posing significant health risks. In this study we identified 80 human genes whose orthologs regulate food intake in model animal species. More than 80 % of these genes encode G-protein-coupled receptors and 29 % were found to be involved in developmental processes. Using the phylostratigraphic age index (PAI), which specifies the evolutionary age of a gene, we found that this set of 80 genes contains an increased proportion of genes with the same phylostratigraphic age (PAI = 6, the stage of Vertebrata divergence), indicating the coordinated evolution of this group of genes. Using the divergence index (DI), which indicates the type of selection to which a gene is subjected, we observed significant enrichment for genes with DI ≤ 0.25, i.e., those that are subject to strong stabilizing selection. The subgroup of genes with DI ≤ 0.25 included 45 genes and was enriched with genes that are associated with developmental processes. This finding supports the hypothesis that developmental disturbances generally impose strong constraints on viability due to purifying selection.
Title: Functional and evolutionary characteristics of human genes encoding cell surface receptors involved in the regulation of appetite. Journal: Journal of Integrative Bioinformatics. Publication date: 2026-01-08. DOI: 10.1515/jib-2025-0023
Metabolomics studies require complex data processing pipelines to ensure data quality and extract meaningful biological insights. GetFeatistics is an R-package developed to streamline the elaboration and statistical analysis of metabolomics data. For targeted analyses, the package enables calibration curve-based quantification with different data weighting options. For untargeted studies, it includes dedicated functions to import feature tables from tools like patRoon and MS-DIAL, assign annotation confidence levels, and filter features based on pooled quality control (QC) criteria, including options for group-specific pooled QCs. The package also provides functions for univariate and multivariate statistical analyses, notably streamlined regression modelling with fixed effects, mixed-effects models for longitudinal data, and Tobit regression for censoring values exceeding the limits of detection. Output tables are concise and informative, facilitating interpretation and reporting, while output visualisations are fully customisable via the ggplot grammar. Additional functionalities include automated retrieval of chemical properties from PubChem, ontology classification via ClassyFire, and pathway enrichment analysis using the FELLA package. GetFeatistics is publicly available on GitHub, with comprehensive documentation and a step-by-step vignette. By integrating key steps of the metabolomics workflow, the package aims to facilitate both exploratory studies and large-scale epidemiological applications in metabolomics research.
Title: Streamlining feature elaboration and statistics analysis in metabolomics: the GetFeatistics R-package. Authors: Gianfranco Frigerio. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-12-24. DOI: 10.1515/jib-2025-0047 (https://doi.org/10.1515/jib-2025-0047)
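The calibration curve-based quantification with data weighting mentioned in the abstract above can be sketched as a weighted least-squares line fit. This is not the GetFeatistics API (the package is written in R); the function names, the weighting labels `"none"`, `"1/x"` and `"1/x^2"`, and the calibration values are assumptions chosen for illustration.

```python
def weighted_linear_fit(conc, signal, weighting="1/x"):
    """Fit signal = slope * conc + intercept by weighted least squares."""
    if weighting == "none":
        w = [1.0 for _ in conc]
    elif weighting == "1/x":
        w = [1.0 / c for c in conc]
    elif weighting == "1/x^2":
        w = [1.0 / (c * c) for c in conc]
    else:
        raise ValueError(f"unknown weighting: {weighting}")

    # Weighted sums used by the closed-form solution of the normal equations.
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, conc))
    swy = sum(wi * y for wi, y in zip(w, signal))
    swxx = sum(wi * x * x for wi, x in zip(w, conc))
    swxy = sum(wi * x * y for wi, x, y in zip(w, conc, signal))

    denom = sw * swxx - swx * swx
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return slope, intercept


def quantify(signal_value, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (signal_value - intercept) / slope


# Hypothetical calibration standards lying exactly on signal = 2 * conc + 0.5.
standards = [1.0, 2.0, 5.0, 10.0, 50.0]
responses = [2.0 * c + 0.5 for c in standards]
slope, intercept = weighted_linear_fit(standards, responses, weighting="1/x")
estimated = quantify(20.5, slope, intercept)
```

Weights such as 1/x give low-concentration standards more influence on the fit, which is a common reason to offer several weighting options.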
Pub Date: 2025-07-14. eCollection Date: 2025-06-01. DOI: 10.1515/jib-2024-0052
Rawan Gedeon, Atulya Nagar
This study investigates the application of deep learning techniques for segmenting glands in histopathological images of colorectal cancer. We trained two convolutional neural network models, U-Net and DCAN, on a combination of the GlaS and CRAG datasets to enhance generalization across diverse histological appearances, selecting DCAN for its superior accuracy in delineating gland boundaries. The goal was to achieve robust gland segmentation applicable to whole slide images (WSIs) from The Cancer Genome Atlas (TCGA). Using the segmented glands, we extracted patient-level morphological features and used them to predict survival outcomes. A Cox proportional hazards model was trained on these features and achieved a high concordance index, indicating strong predictive performance. Patients were then stratified into high- and low-risk groups, with significant differences in survival distributions (log-rank p-value: 0.01317). In addition, we benchmarked our models against state-of-the-art gland segmentation methods on GlaS and CRAG, highlighting the trade-off between domain-specific accuracy and cross-dataset robustness.
Title: Colon cancer survival prediction from gland shapes within histology slides using deep learning. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-07-14. DOI: 10.1515/jib-2024-0052. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12569585/pdf/
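The concordance index reported for the Cox model in the abstract above measures how often a higher predicted risk goes with a shorter survival time. The following is a minimal sketch of Harrell's concordance index in plain Python; the function name and the toy patient values are hypothetical, and the study itself may have used a library implementation with different tie handling.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ordered correctly by risk.

    times       -- observed follow-up time per patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: the third patient is censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a "high" concordance index means the predicted risks rank patients close to their observed survival order.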
Pub Date: 2025-07-09. eCollection Date: 2025-03-01. DOI: 10.1515/jib-2025-0034
Ralf Hofestädt
Title: Editorial - 20 years Journal of Integrative Bioinformatics. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-07-09. DOI: 10.1515/jib-2025-0034. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327197/pdf/
Pub Date: 2025-07-01. eCollection Date: 2025-03-01. DOI: 10.1515/jib-2025-0007
Falk Schreiber, Tobias Czauderna, Dimitar Garkov, Niklas Gröne, Karsten Klein, Matthias Lange, Uwe Scholz, Björn Sommer
Sustainable software development requires the software to remain accessible and maintainable over a long time. This is particularly challenging in a scientific context. For example, fewer than one third of tools and platforms for biological network representation, analysis, and visualisation have been available and supported over a period of 15 years. One of those tools is Vanted, which has been developed and actively supported over the past 20 years. In this work, we discuss sustainable software development in science and investigate which software tools for biological network representation, analysis, and visualisation have been maintained over a period of at least 15 years. With Vanted as a case study, we highlight five key insights that we consider crucial for sustainable, long-term software development and software maintenance in science.
Title: Sustainable software development in science - insights from 20 years of Vanted. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-07-01. DOI: 10.1515/jib-2025-0007. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327203/pdf/
This study investigates the gut microbiome and metabolome of asthma patients treated with inhaled corticosteroids (ICS), some of whom experience adverse side effects. We analyzed stool samples from 24 participants, divided into three cohorts: asthma patients with side effects, those without, and healthy controls. Using next-generation sequencing and LC-MS/MS metabolomics, we identified significant differences in bacterial species and metabolites. Multi-Omics Factor Analysis (MOFA) and Global Sensitivity Analysis-Partial Rank Correlation Coefficient (GSA-PRCC) provided insights into key contributors to side effects, such as tryptophan depletion and altered linolenate and glucose-1-phosphate levels. The study proposes dietary or probiotic interventions to mitigate side effects. Despite the limited sample size, these findings provide a basis for personalized asthma management approaches. Further studies are required to confirm these initial findings.
Title: Metagenome and metabolome study on inhaled corticosteroids in asthma patients with side effects. Authors: Igor Goryanin, Anatoly Sorokin, Meder Seitov, Berik Emilov, Muktarbek Iskakov, Irina Goryanin, Batyr Osmonov. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-06-24. DOI: 10.1515/jib-2024-0062 (https://doi.org/10.1515/jib-2024-0062)
Pub Date: 2025-06-23. eCollection Date: 2025-06-01. DOI: 10.1515/jib-2024-0047
Pablo Enrique Guillem, Marco Zurdo-Tabernero, Noelia Egido Iglesias, Ángel Canal-Alonso, Liliana Durón Figueroa, Guillermo Hernández, Angélica González-Arrieta, Fernando de la Prieta
The rapid advancement of Next-Generation Sequencing (NGS) technologies has revolutionized the field of genomics, producing large volumes of data that necessitate sophisticated analytical techniques. This paper introduces a Deep Learning model designed to predict the pathogenicity of genetic variants, a vital component in advancing personalized medicine. The model is trained on a dataset derived from the analysis of NGS outputs, containing a combination of well-defined and ambiguous genetic variants. By employing a semi-supervised learning approach, the model efficiently utilizes both confidently labeled and less certain data. At the core of the methodology is the Feature Tokenizer Transformer architecture, which processes both numerical and categorical genomic information. The preprocessing pipeline includes key steps such as data imputation, scaling, and encoding to ensure high data quality. The results highlight the model's impressive accuracy, particularly in detecting confidently labeled variants, while also addressing the impact of its predictions on less certain (soft-labeled) data.
Title: Leveraging transformers for semi-supervised pathogenicity prediction with soft labels. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-06-23. DOI: 10.1515/jib-2024-0047. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12569578/pdf/
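One common way to train on both confidently labeled and soft-labeled variants, as the abstract above describes, is a cross-entropy loss whose target is a probability distribution rather than a single class. The abstract does not state the loss that was actually used, so the sketch below is an assumption; the function names and example values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, target_probs):
    """Cross-entropy between predicted probabilities and a soft target.

    A hard label is the special case where target_probs is one-hot,
    e.g. [0.0, 1.0] for a confidently pathogenic variant.
    """
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0)


# Hypothetical two-class case: index 0 = benign, index 1 = pathogenic.
hard_loss = soft_cross_entropy([0.0, 0.0], [0.0, 1.0])
soft_loss = soft_cross_entropy([2.0, 0.0], [0.3, 0.7])
```

With this formulation the same loss function covers both kinds of variants, and the soft target controls how strongly an uncertain annotation pulls the prediction.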
Pub Date: 2025-06-23. eCollection Date: 2025-03-01. DOI: 10.1515/jib-2024-0037
Aruana F F Hansel-Fröse, Christoph Brinkrolf, Marcel Friedrichs, Bruno Dallagiovanna, Lucia Spangenberg
Stem cells are capable of self-renewal and differentiation into various cell types, showing significant potential for cellular therapies and regenerative medicine, particularly in cardiovascular diseases. Differentiation to cardiomyocytes recapitulates embryonic heart development, potentially supporting cardiac regeneration. Cardiomyogenesis is controlled by complex post-transcriptional regulation that affects the construction of gene regulatory networks (GRNs), including alternative polyadenylation (APA), length changes in 3' untranslated regions (3'UTRs), and microRNA (miRNA) regulation. To deepen our understanding of the cardiomyogenesis process, we modeled a GRN for each day of cardiomyocyte differentiation. Each GRN was then automatically transformed into a Petri net using four transformation rules and simulated with the software VANESA. The Petri nets highlighted the relationship between genes and alternative isoforms, emphasizing the inhibition by miRNAs of APA isoforms with varying 3'UTR lengths. Moreover, in silico simulation of miRNA knockout enabled the visualization of the consequential effects on isoform expression. Our Petri net models provide a resourceful tool and a holistic perspective for investigating the orchestration of transcript regulation that drives the differentiation of hESCs to cardiomyocytes. Additionally, the models can be adapted to investigate post-transcriptional GRNs in other biological contexts.
Title: Petri net modeling and simulation of post-transcriptional regulatory networks of human embryonic stem cell (hESC) differentiation to cardiomyocytes. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-06-23. DOI: 10.1515/jib-2024-0037. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327202/pdf/
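The idea of simulating miRNA inhibition and an in silico knockout on a Petri net, as described in the abstract above, can be shown with a very small discrete net that uses an inhibitor arc. This sketch does not reproduce the paper's four transformation rules or the VANESA formalism; the place names, the single `translate` transition, and the token counts are hypothetical.

```python
def enabled(marking, transition):
    """A transition can fire if inputs are available and no inhibitor place is marked."""
    has_inputs = all(marking[p] >= n for p, n in transition["inputs"].items())
    not_inhibited = all(marking[p] == 0 for p in transition["inhibitors"])
    return has_inputs and not_inhibited


def fire(marking, transition):
    """Return the marking after one firing, or an unchanged copy if not enabled."""
    new_marking = dict(marking)
    if not enabled(marking, transition):
        return new_marking
    for place, n in transition["inputs"].items():
        new_marking[place] -= n
    for place, n in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


def simulate(marking, transition, steps):
    """Attempt to fire the same transition a fixed number of times."""
    for _ in range(steps):
        marking = fire(marking, transition)
    return marking


# Hypothetical transition: a long-3'UTR isoform is translated unless the miRNA is present.
translate = {
    "inputs": {"long_isoform": 1},
    "outputs": {"protein": 1},
    "inhibitors": ["miRNA"],
}

wild_type = simulate({"long_isoform": 2, "protein": 0, "miRNA": 1}, translate, 3)
knockout = simulate({"long_isoform": 2, "protein": 0, "miRNA": 0}, translate, 3)
```

Setting the miRNA place to zero tokens plays the role of the knockout: the inhibitor arc no longer blocks the transition, so the isoform tokens are converted into protein tokens.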
Pub Date: 2025-06-18. eCollection Date: 2025-06-01. DOI: 10.1515/jib-2024-0057
Guilherme Henriques, Maryam Abbasi, Daniel Martins, Joel P Arrais
This study explores the use of deep learning to analyze genetic data and predict phenotypic traits associated with schizophrenia, a complex psychiatric disorder with a strong hereditary component yet incomplete genetic characterization. We applied convolutional neural network (CNN) models to a large-scale case-control exome sequencing dataset from the Swedish population to identify genetic patterns linked to schizophrenia. To enhance model performance and reduce overfitting, we employed advanced optimization techniques, including dropout layers, learning rate scheduling, batch normalization, and early stopping. Following systematic refinements in data preprocessing, model architecture, and hyperparameter tuning, the final model achieved an accuracy of 80 %. These results demonstrate the potential of deep learning approaches to uncover intricate genotype-phenotype relationships and support their future integration into precision medicine and genetic diagnostics for psychiatric disorders such as schizophrenia.
Title: Integrating AI and genomics: predictive CNN models for schizophrenia phenotypes. Journal: Journal of Integrative Bioinformatics. Publication date: 2025-06-18. DOI: 10.1515/jib-2024-0057. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12569582/pdf/
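Of the overfitting controls listed in the abstract above, early stopping is the simplest to show without a deep learning framework. The sketch below applies a patience rule to a list of validation losses; the function name, the `patience` and `min_delta` parameters, and the loss values are hypothetical and are not taken from the study.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a patience-based early stopping rule.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran through every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation curve that stops improving after epoch 2.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In practice the model weights from `best_epoch` are usually restored after stopping, so later epochs that merely fit noise in the training data are discarded.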