Pub Date: 2025-11-17; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1663846
Osasan Stephen Adebayo, George Oche Ambrose, Daramola Olusola, Adefolalu Oluwafemi, Hind A Alzahrani, Abdulkarim Hasan
Introduction: KRAS mutations are key oncogenic drivers in lung cancer, yet effective pharmacological targeting has remained a major challenge due to the protein's elusive and dynamic binding pockets. Computational modeling offers a promising route to identify novel inhibitors with improved potency and selectivity.
Methods: A quantitative structure-activity relationship (QSAR) modeling approach was developed to predict the inhibitory potency (pIC50) of KRAS inhibitors and support de novo drug design. Molecular descriptors for 62 inhibitors retrieved from the ChEMBL database (CHEMBL4354832) were computed using Chemopy. Following descriptor normalization and dimensionality reduction, five machine learning algorithms were applied: partial least squares (PLS), random forest (RF), stepwise multiple linear regression (MLR), genetic algorithm-optimized MLR (GA-MLR), and XGBoost. Model performance was evaluated using R2, RMSE, and MAE, while permutation-based importance and SHAP analyses provided feature interpretability.
Results: Among the models tested, PLS exhibited the best predictive performance (R2 = 0.851; RMSE = 0.292), followed by RF (R2 = 0.796). The GA-MLR model, based on eight optimized molecular descriptors, achieved good interpretability and robust internal validation (R2 = 0.677). Virtual screening of 56 de novo designed compounds within the model's applicability domain identified compound C9, with a predicted pIC50 of 8.11, as the most promising hit.
Discussion: This integrative QSAR modeling and de novo design framework effectively predicted the bioactivity of KRAS inhibitors and facilitated the identification of novel candidate molecules. The findings demonstrate the utility of combining interpretable machine learning models with virtual screening to accelerate the discovery of potent KRAS inhibitors for lung cancer therapy.
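The quantities named in this abstract (pIC50, R2, RMSE, MAE) have standard definitions that can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: the function names `pic50_from_ic50_nm` and `regression_metrics` are hypothetical, and the input unit (nanomolar) is an assumption made for the example.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in mol/L).

    The nanomolar input unit is an assumption for this sketch.
    """
    return -math.log10(ic50_nm * 1e-9)


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and MAE for paired observed/predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,  # undefined if all y_true are equal
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }
```

Under these definitions an IC50 of 10 nM corresponds to a pIC50 of 8, so a predicted pIC50 of 8.11 implies a predicted IC50 slightly below 10 nM.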
Title: QSAR-guided discovery of novel KRAS inhibitors for lung cancer therapy. Journal: Frontiers in Bioinformatics, volume 5, pages 1663846. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665777/pdf/
Pub Date: 2025-11-13; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1636240
Aanya Gupta, Koji Abe, Holden T Maecker
Introduction: FluPRINT is a multi-omics dataset that measures donors' protein expression and cell counts across various assays. Each donor was also assigned a binary label: high responder (1) if the hemagglutination inhibition (HAI) antibody titer showed a fold change ≥4 from day 0 to day 28, and low responder (0) otherwise. In this project, we used the MOFA and Stabl algorithms to analyze FluPRINT, estimate the population structure from the data, and identify the most important features for predicting response to the vaccine.
Methods: Preprocessing of the dataset included removing repeated features, scaling by assay, and removing outliers. Because Stabl does not directly handle missing values, features with a high proportion of missing values were removed and the remaining missing values were ignored.
Results: MOFA identified the top feature in structure extraction as IL neg 2 CD4 pos CD45Ra neg pSTAT5. MOFA explains the variance of the data well while also selecting features that are statistically significant (p < 0.05). Stabl found the top feature for explaining the outcome to be CD33- CD3+ CD4+ CD25hiCD127low CD161+ CD45RA+ Tregs, which matched the top result of a previously published analysis. MOFA's features achieved an AUROC of 0.616 (95% CI of 0.426-0.806), and Stabl's achieved an AUROC of 0.634 (95% CI of 0.432-0.823).
Discussion: Our research addresses a key knowledge gap: understanding how these fundamentally different analytical approaches perform when analyzing the same complex dataset. Our exploration evaluates their respective strengths, limitations, and biological insights and provides guidance on using MOFA and Stabl to find the best predictive cell subsets and features for understanding large immunological multi-omics data. The code for this project can be found at https://github.com/aanya21gupta/fluprint.
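The responder label and the AUROC reported in this abstract can be illustrated with a short sketch. This is not the code from the linked repository; `label_responder` and `auroc` are hypothetical helper names, and the AUROC is computed with the generic pairwise (Mann-Whitney) formulation rather than whatever routine the authors used.

```python
def label_responder(titer_day0: float, titer_day28: float,
                    threshold: float = 4.0) -> int:
    """Return 1 (high responder) if the day 28 / day 0 HAI titer
    fold change is at least `threshold`, else 0 (low responder)."""
    return 1 if titer_day28 / titer_day0 >= threshold else 0


def auroc(labels, scores) -> float:
    """AUROC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUROC of 0.5 corresponds to chance-level ranking, which is why the lower confidence bounds reported above (0.426 and 0.432) matter when interpreting the point estimates.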
Title: Comprehensive analysis of multi-omics vaccine response data using MOFA and Stabl algorithms. Journal: Frontiers in Bioinformatics, volume 5, pages 1636240. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12657425/pdf/
Accurate prediction of antibody paratopes is a critical challenge in structure-limited, high-throughput discovery workflows. We present ParaDeep, a lightweight and interpretable deep learning framework for residue-level paratope prediction directly from amino acid sequences. ParaDeep integrates bidirectional long short-term memory networks with one-dimensional convolutional layers to capture both long-range sequence context and local binding motifs. We systematically evaluated 30 model configurations varying in encoding schemes, convolutional kernel sizes, and antibody chain types. In five-fold cross-validation, heavy (H) chain models achieved the highest performance (F1 = 0.856 ± 0.014, MCC = 0.842 ± 0.015), outperforming light (L) chain models (F1 = 0.774 ± 0.023, MCC = 0.772 ± 0.022). On an independent blind test set, ParaDeep attained F1 = 0.723 and MCC = 0.685 for H chains, and F1 = 0.607 and MCC = 0.587 for L chains, representing a 27% MCC improvement over the sequence-based baseline Parapred. Chain-specific modeling revealed that heavy chains provide stronger sequence-based predictive signals, while light chains benefit more from structural context. ParaDeep approaches the performance of state-of-the-art structure-based methods on heavy chains while requiring only sequence input, enabling faster and broader applicability without the computational cost of 3D modeling. Its efficiency and scalability make it well-suited for early-stage antibody discovery, repertoire profiling, and therapeutic design, particularly in the absence of structural data. The implementation is freely available at https://github.com/PiyachatU/ParaDeep, with Python (PyTorch) code and a Google Colab interface for ease of use.
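Both metrics reported above, F1 and the Matthews correlation coefficient (MCC), follow directly from the per-residue confusion matrix. The sketch below uses the standard formulas; the function names are hypothetical and it is not taken from the ParaDeep repository.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary per-residue labels (1 = paratope)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike F1, MCC also uses the true negatives, which is relevant for residue-level tasks where non-binding residues typically outnumber binding ones.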
Title: ParaDeep: sequence-based deep learning for residue-level paratope prediction using chain-aware BiLSTM-CNN models. Authors: Piyachat Udomwong, Thanathat Pamonsupornwichit, Kanchanok Kodchakorn, Chatchai Tayapiwatana. Pub Date: 2025-11-05; DOI: 10.3389/fbinf.2025.1684042. Journal: Frontiers in Bioinformatics, volume 5, pages 1684042. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12626946/pdf/
Pub Date: 2025-11-04; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1681811
Kylee K Rahm, Branden S Kinghorn, Myanna J Moody, Ben C Stone, Kenton C Strong, Brian S Kim, Yen Jou Chang, Samantha N Sleight, Alyssa A Nitz, David V Hansen, Matthew H Bailey
Introduction: Recent advances in Alzheimer's research suggest that the brain's immune system plays a critical role in the development and progression of this devastating disease. Microglial cells are vital as immune cells in the brain's defense system. Human Microglia Clone 3 (HMC3) is a cell line developed as a promising experimental model to understand the role of microglial cells in human diseases including Alzheimer's and other neurodegenerative diseases. The frequency of HMC3 cell usage has increased in recent years, with the idea that this cell line could serve as a convenient model for human microglial cell functions.
Methods: We utilized gene-pair ratios from bulk and single-cell RNA sequencing (scRNA-seq) expression data to create predictive models of cell-type origins.
Results: Our model reveals that the HMC3 cell line represents various cell types, with the highest cell similarity score relating to astrocytes, not microglia.
Discussion: These findings suggest that the HMC3 cell line is not a reliable human microglia model and that extreme caution should be taken when interpreting the results of studies using the HMC3 cell line.
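The Methods mention gene-pair ratios as model features. A minimal sketch of how such features might be derived from one sample's expression values is shown below; the function name, the log2 transform, the pseudocount, and the placeholder gene names are all assumptions for illustration and are not taken from the paper.

```python
import math
from itertools import combinations


def gene_pair_log_ratios(expression: dict, pseudocount: float = 1.0) -> dict:
    """Return log2((a + c) / (b + c)) for every unordered gene pair.

    `expression` maps gene name -> expression value for a single sample.
    The pseudocount `c` (an assumption of this sketch) avoids division
    by zero for unexpressed genes.
    """
    genes = sorted(expression)
    return {
        f"{a}/{b}": math.log2(
            (expression[a] + pseudocount) / (expression[b] + pseudocount)
        )
        for a, b in combinations(genes, 2)
    }
```

With a pseudocount of zero, any per-sample scaling factor cancels in each ratio, so the features compare genes within a sample rather than relying on absolute expression levels.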
Title: Cellf-deception: human microglia clone 3 (HMC3) cells exhibit more astrocyte-like than microglia-like gene expression. Journal: Frontiers in Bioinformatics, volume 5, pages 1681811. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12623408/pdf/
Pub Date: 2025-11-03; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1665892
Ana Stolnicu, Peter Eckhardt-Bellmann, Angelika M R Kestler, Hans A Kestler
Introduction: Numerous biological systems exhibit ordinal connections between categories. Developmental and time-series information inherently depict sequences like "early," "intermediate," and "late" phases, showing that these specific processes follow a progression. Ordinal classification techniques are often applied in biological and medical contexts, ranging from the evaluation of pain intensity to the detection of evolving diseases such as cancer. These ranking systems may assist clinicians in establishing diagnoses and developing tailored treatment plans. For instance, tumor staging might guide early detection strategies and targeted therapies, improving patient outcomes. However, applying ordinal classification to biological data presents considerable challenges. In addition to their high dimensionality, these datasets can be highly heterogeneous, often reflecting branching processes that occur simultaneously during progression. Factors such as intratumoral diversity, asynchronous progress, and context-specific signaling activity may interfere with the identification of such alternative development routes.
Methods: To address these challenges, we propose a framework for uncovering ordinal relationships within molecular data. Specifically, directed threshold classifiers are introduced as base learners for ordinal classifier cascades, enabling the detection of both total and partial orderings between molecular states.
Results: This approach preserves the inherent ordinal structure by projecting high-dimensional data onto a single dimension while simultaneously decreasing complexity. Additionally, the distinct features of the resulting thresholds allow the prediction of potential alternative paths among the suborders.
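To make the idea of a threshold base learner inside an ordinal cascade concrete, here is a deliberately simplified sketch. It is not the authors' algorithm: the class `DirectedThreshold`, the function `cascade_predict`, and the stop-at-first-failure rule are assumptions chosen to illustrate how single-feature thresholds with a direction can assign an ordered class.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DirectedThreshold:
    """Single-feature threshold classifier with a direction.

    direction = +1: values above the threshold indicate the later class.
    direction = -1: values below the threshold indicate the later class.
    """
    feature: int
    threshold: float
    direction: int

    def fires(self, sample: Sequence[float]) -> bool:
        return self.direction * (sample[self.feature] - self.threshold) > 0


def cascade_predict(stages: Sequence[DirectedThreshold],
                    sample: Sequence[float]) -> int:
    """Walk the ordered stages and return the index of the first stage
    that does not fire; a sample passing every stage gets the last class."""
    for k, stage in enumerate(stages):
        if not stage.fires(sample):
            return k
    return len(stages)
```

For example, two stages on the same feature with thresholds 1.0 and 2.0 (both with direction +1) split that feature's axis into three ordered intervals, one per class.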
Title: Identification of ordinal relations and alternative suborders within high-dimensional molecular data. Journal: Frontiers in Bioinformatics, volume 5, pages 1665892. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12620363/pdf/
Pub Date: 2025-10-31; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1705252
Yaan J Jang
Title: Editorial: Computational protein function prediction based on sequence and/or structural data. Journal: Frontiers in Bioinformatics, volume 5, pages 1705252. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12615499/pdf/
Pub Date: 2025-10-31; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1645520
Helya Goharbavang, Artem T Ashitkov, Athira Pillai, Joshua D Wythe, Guoning Chen, David Mayerich
Recent advances in three-dimensional microscopy enable imaging of whole-organ microvascular networks in small animals. Since microvasculature plays a crucial role in tissue development and function, its structure may provide diagnostic biomarkers and insight into disease progression. However, the microscopy community currently lacks benchmarks for scalable algorithms to measure these potential biomarkers. While many algorithms exist for segmenting vessel-like structures and extracting their surface features and connectivity, they have not been thoroughly evaluated on modern gigavoxel-scale images. In this paper, we present a comprehensive yet compact survey of available algorithms. We focus on essential features for microvascular analysis, including extracting vessel surfaces and the network's associated connectivity. We select a series of algorithms based on popularity and availability and provide a thorough quantitative analysis of their performance on datasets acquired using light sheet fluorescence microscopy (LSFM), knife-edge scanning microscopy (KESM), and X-ray microtomography (µ-CT).
Title: Segmentation and modeling of large-scale microvascular networks: a survey. Journal: Frontiers in Bioinformatics, volume 5, pages 1645520. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12616183/pdf/
Pub Date: 2025-10-30; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1693343
Genevieve Laprade, Quinn Lee, Kristin L Gallik, Michael Nelson, Natalie Woo, Celina Terán Ramírez, Alexis Ricardo Becerril Cuevas, Kevin W Eliceiri, Corinne Esquibel
The fields of bioimaging and image analysis are rapidly expanding as new technologies transform biological questions into novel insights. While professionals of varying expertise are essential to achieving these advancements, early-career scientists, a prominent user group within the imaging community, are often assumed to have the prerequisite knowledge and ability to use these tools. This demographic, consisting of students, post-docs, and bioimage analysis trainees, is critical for the field to continue to evolve and flourish. However, obstacles such as geographic location, language barriers, insufficient funding or training, and instrument availability hinder access to resources and introduce significant knowledge gaps, especially for scientists in early-career stages. Democratized resources for bioimaging and analysis, such as forums, community organizations, and publicly available datasets, have been helpful in overcoming barriers to access for early-career scientists. Here, we discuss the current tools and resources available for early-career researchers, highlight their limitations from the learners' perspective, and propose strategies to better support this group. As bioimage analysis extends broadly into many scientific disciplines, we implore all members of this community, regardless of experience level, to empower next-generation scientists.
Title: The importance of democratized resources in early-career training for bioimage analysts and bioimaging scientists. Journal: Frontiers in Bioinformatics, volume 5, pages 1693343. Impact factor: 3.9. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12611831/pdf/
Pub Date: 2025-10-29; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1674179
Priscila Galvão Doria, Gisele Vieira Rocha, Vanessa Dybal Bertoni, Roberto de Souza Batista Dos Santos, Mariana Araújo-Pereira, Clarissa Gurgel
Introduction: Colon cancer is a common disease, treated with few chemotherapeutic agents with similar treatment sequencing despite its heterogeneity. A significant proportion of patients are diagnosed with metastasis, and resistance to antineoplastic drugs is associated with disease progression and therapeutic failure. It is known that the tumor microenvironment plays an essential role in cancer progression, contributing to processes that may be associated with therapeutic resistance mechanisms in colon cancer. In this study, we aim to identify a gene expression signature and its relationship with immune cell infiltration in colon cancer, contributing to the identification of potential resistance biomarkers.
Methods: An in silico study was conducted using RNA-seq data from The Cancer Genome Atlas Program (TCGA) samples, subdivided into two groups (treatment-resistant and non-resistant), taking into account the molecular subgroups (CMS1, CMS2, CMS3, and CMS4). The following algorithms were used: i. Limma was applied to identify differentially expressed genes; ii. WGCNA was applied to construct co-expression networks; iii. CIBERSORT was applied to estimate the proportion of infiltrating immune cells; and iv. TIMER was applied to explore the relationship between core genes and immune cell content.
Results: Twenty differentially expressed genes (DEGs) were found, 18 of which were related to the group considered resistant to oncologic treatment and to poorer overall survival. T CD4 memory resting cells and M0 and M2 macrophages were found in greater proportions in the analyzed samples, and their infiltration of the tumor microenvironment increased with the expression of some of these resistance DEGs. Additionally, these genes correlate with biological aspects of neuronal differentiation, axogenesis, and synaptic transmission.
Conclusion: The gene expression signature suggests the presence of differentially expressed synaptic membrane genes, which may be involved in neuronal pathways that influence the tumor microenvironment, potentially serving as future biomarkers. Furthermore, the presence of M0 and M2 macrophages and T CD4 memory resting cells suggests a potential interaction that may play a role in therapeutic resistance.
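The relationship between gene expression and immune cell content described in the Methods can be illustrated with a rank correlation. The following is a minimal sketch, not the TIMER procedure itself: it computes a plain Spearman correlation between one gene's expression and an estimated immune cell fraction, whereas TIMER may apply further adjustments that are not shown here. The function names (`average_ranks`, `spearman`) and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data (invented, not from the study): expression of one gene
# and an estimated M2 macrophage fraction across six samples.
gene_expression = [2.1, 3.4, 1.0, 5.6, 4.2, 3.9]
m2_fraction = [0.05, 0.11, 0.02, 0.20, 0.15, 0.12]

rho = spearman(gene_expression, m2_fraction)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is used in this sketch because estimated cell fractions are bounded and often skewed, so a monotonic association is easier to defend than a linear one. In practice the inputs would be per-sample expression values and the cell fractions produced by a deconvolution tool such as CIBERSORT.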
{"title":"Gene expression profile in colon cancer therapeutic resistance and its relationship with the tumor microenvironment.","authors":"Priscila Galvão Doria, Gisele Vieira Rocha, Vanessa Dybal Bertoni, Roberto de Souza Batista Dos Santos, Mariana Araújo-Pereira, Clarissa Gurgel","doi":"10.3389/fbinf.2025.1674179","DOIUrl":"10.3389/fbinf.2025.1674179","url":null,"PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1674179"},"PeriodicalIF":3.9,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12604976/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145515086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-24; eCollection Date: 2025-01-01; DOI: 10.3389/fbinf.2025.1708311
Giulia Ghisleni, Christian Stolte, Megan Gozzard, Lea Von Soosten, Antonia Bruno
This perspective paper examines the profound cognitive and methodological parallels between scientific and artistic research, challenging the traditional distinction between the two domains. While science and art use different languages, both emerge from the human drive for creativity and understanding. We argue that scientific inquiry, often presented as strictly objective and methodical, inherently shares with art the need for imagination, flexibility, and interpretative thinking. Drawing on neuroscience, education, design theory, and the visual arts, we highlight how artistic practices, particularly in the visual arts, can enhance scientific learning, innovation, and public engagement. We advocate integrating art into scientific training and research to foster a more creative and inclusive epistemology. Through examples in microbiology, education, and data visualization, we show how the arts can support deeper understanding, cross-disciplinary collaboration, and more effective science communication. Ultimately, we call for a shift toward a more integrated approach that embraces the complementary strengths of both art and science in advancing knowledge and societal impact.
{"title":"Why science needs art.","authors":"Giulia Ghisleni, Christian Stolte, Megan Gozzard, Lea Von Soosten, Antonia Bruno","doi":"10.3389/fbinf.2025.1708311","DOIUrl":"10.3389/fbinf.2025.1708311","url":null,"PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"5 ","pages":"1708311"},"PeriodicalIF":3.9,"publicationDate":"2025-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12592062/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145483967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}