Automated leaf segmentation pipelines must balance accuracy, scalability, and usability to be readily adopted in plant research. We present an end-to-end deep learning pipeline designed for practical use in plant phenotyping, which we developed and evaluated during a real-world plant growth experiment using Atriplex lentiformis. The pipeline integrates a fine-tuned Mask Region-based Convolutional Neural Network (Mask R-CNN) segmentation model trained on 176 plant images and achieves high performance despite the small training data set (Dice coefficient = 0.781). We quantitatively compare the fine-tuned Mask R-CNN model to Meta AI's Segment Anything Model (SAM) and evaluate natural language prompts using Grounded SAM and the Leaf-Only SAM post-processing pipeline for refining segmentation outputs. Our findings highlight that transfer learning on a specialized data set can still outperform a large foundation model in domain-specific tasks. In addition, we integrate QR codes for automated sample identification and benchmark multiple QR code decoding libraries, evaluating their robustness under real-world imaging conditions like distortion and lighting variation. To ensure accessibility, we deploy the pipeline as a user-friendly Streamlit web application, allowing researchers to analyze images without deep learning expertise. By focusing on practical deployment in addition to model performance, this study provides an open-source, scalable framework for plant science applications and addresses real-world challenges in automation and usability by the end-researcher.
Title: A User-Friendly Machine Learning Pipeline for Automated Leaf Segmentation in Atriplex lentiformis. Authors: Michelle Lynn Yung, Kamila Murawska-Wlodarczyk, Alicja Babst-Kostecka, Raina Margaret Maier, Nirav Merchant, Aikseng Ooi. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251344033 (impact factor 2.3). Pub Date: 2025-06-08. DOI: 10.1177/11779322251344033. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12149614/pdf/
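The Dice coefficient reported for the leaf segmentation model above measures the overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal shape.

    Masks are nested lists of 0/1 (or False/True) values; this format is an
    assumption for illustration, not the format used in the paper.
    """
    if len(pred_mask) != len(true_mask) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred_mask, true_mask)
    ):
        raise ValueError("masks must have the same shape")

    # Flatten both masks into lists of booleans.
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]

    overlap = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    if total == 0:
        # Both masks are empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * overlap / total


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(dice_coefficient(predicted, reference))
```

A value of 1.0 means the two masks coincide exactly and 0.0 means they share no pixels, which is the scale on which the reported 0.781 should be read.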
Moyamoya disease (MMD) is a rare, chronic cerebrovascular disorder of uncertain etiology. Although abnormal glucose metabolism has been implicated, the contribution of glycosylation-related genes in MMD remains elusive. In this study, we analyzed 2 transcriptome data sets (GSE189993 and GSE131293) from the Gene Expression Omnibus (GEO) database to identify 723 differentially expressed genes (DEGs) between MMD patients and controls. Intersection genes with known glycosylation-related genes underwent Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. We utilized machine learning to select key hub genes, followed by immune cell infiltration and correlation analyses. In-depth immune cell analysis indicated that both CFP and MGAT5B were closely tied to various immune components, suggesting potential crosstalk between glycosylation pathways and immune regulation. Notably, CFP was positively associated with pDCs, HLA, and CCR, whereas MGAT5B correlated with B-cells, check-points, and T helper cells but showed a negative relationship with Tregs, hinting at an immunoregulatory mechanism influencing MMD progression. Motif-TF annotation highlighted csibp_M2095 as the motif with the highest normalized enrichment score (NES: 6.57). Reverse microRNA (miRNA)-gene prediction identified 75 miRNAs regulating these focus genes, along with 126 miRNA-miRNA interconnections. Connectivity Map (Cmap) analysis revealed that Chenodeoxycholic acid, MRS-1220, Phenytoin, and Piceid were strongly negatively correlated with MMD expression profiles, suggesting potential therapeutic candidates. Enzyme-linked immunosorbent assays confirmed elevated CFP and MGAT5B and reduced PTPN11 in MMD, aligning with our bioinformatic findings. Moreover, PTPN11 knockdown in human brain microvascular endothelial cells (HBMECs) significantly enhanced tube formation, indicating a role in vascular remodeling. 
Collectively, these results emphasize the importance of glycosylation-related genes and immune dysregulation in MMD pathogenesis. These findings broaden our understanding of MMD's underlying mechanisms and underscore the necessity of continued research into glycosylation-driven pathways for improved disease management.
Title: Integrative Machine Learning Approach to Explore Glycosylation Signatures and Immune Landscape in Moyamoya Disease. Authors: Cunxin Tan, Jing Wang, Yanru Wang, Shaoqi Xu, Zhenyu Zhou, Junze Zhang, Shihao He, Ran Duan. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251342412 (impact factor 2.3). Pub Date: 2025-05-24. DOI: 10.1177/11779322251342412. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12103670/pdf/
Pub Date: 2025-05-21; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251339698
Francisco Alejandro Lagunas-Rangel
Sirtuin 6 (SIRT6), a member of the class III histone deacetylase (HDAC) family, is crucial for the maintenance of general health and is associated with increased life expectancy and resistance to age-related diseases such as cancer and metabolic disorders. A comparative analysis of the SIRT6 gene in Ashkenazi Jewish (AJ) centenarians and noncentenarian controls found a distinct allele, centSIRT6, enriched in the centenarian group. This allele features 2 linked substitutions, N308K and A313S, and exhibits enhanced functions, including more efficient suppression of LINE1 retrotransposons, improved repair of DNA double-strand breaks, and increased efficiency in cancer cell killing. Notably, centSIRT6 shows lower deacetylase activity but higher mono-adenosine diphosphate (ADP) ribosyl transferase activity compared with the wild-type enzyme. This study used several bioinformatics tools to explore the structural changes caused by the N308K and A313S substitutions in centSIRT6 and to elucidate how these alterations contribute to changes in the enzymatic activities of SIRT6. The results indicate that these mutations reduce the structural flexibility of centSIRT6, thus weakening its interactions with acetyl-lysine but strengthening its interactions with ADP-ribose. This research provides useful information for future experimental studies to further investigate the molecular mechanisms of centSIRT6.
Title: Structural Insights Into centSIRT6: Bioinformatic Analysis of N308K and A313S Substitution Effects. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251339698 (impact factor 2.3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099093/pdf/
Pub Date: 2025-04-16; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251331104
Abanti Barua, Md Habib Ullah Masum, Ahmad Abdullah Mahdeen
Helicobacter pylori infection of the stomach's epithelial cells is a significant risk factor for stomach cancer. Various H pylori proteins (CagA, GGT, NapA, PatA, urease, and VacA) were targeted to design 2 messenger RNA (mRNA) vaccines, V1 and V2, using bioinformatics tools. Physicochemical parameters, secondary and tertiary structure, molecular docking and dynamic simulation, codon optimization, and RNA structure prediction were also evaluated for these vaccines. Physicochemical analyses revealed that the vaccines are soluble (GRAVY < 0), basic (pI < 7), and stable (aliphatic index < 80). The secondary and tertiary structures of the vaccines demonstrated robustness. Docking with toll-like receptors (TLRs) revealed that the vaccines have a potential affinity for TLR-2 (V1: -1132.3 kJ/mol, V2: -1093.6 kJ/mol) and TLR-4 (V1: -1042.7 kJ/mol, V2: -1201.2 kJ/mol), and molecular dynamics simulations confirmed their dynamic stability. Structural analyses of the V1 (-505.96 kcal/mol) and V2 (-634.92 kcal/mol) mRNA vaccines underscored their stability. In addition, the vaccines produced a considerable rise in B-cell counts and extended activation of T cells, suggesting the potential for long-lasting immunity and enhanced protection against H pylori. These findings also offer hope for the future of stomach cancer prevention. Notably, the study emphasizes the need for subsequent animal and human-based studies to confirm these promising results.
Title: A Reverse Vaccinology and Immunoinformatic Approach for the Designing of a Novel mRNA Vaccine Against Stomach Cancer Targeting the Potent Pathogenic Proteins of Helicobacter pylori. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251331104 (impact factor 2.3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033411/pdf/
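The abstract above uses a negative GRAVY value as its solubility indicator. GRAVY (grand average of hydropathy) is the mean Kyte-Doolittle hydropathy value over all residues of a protein sequence. The sketch below shows that calculation under stated assumptions: the function name `gravy` and the example peptide are hypothetical, and the study's own tools and vaccine sequences are not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean hydropathy of a one-letter protein sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    peptide = "MKKRDEESTG"
    score = gravy(peptide)
    print(f"GRAVY = {score:.3f} ({'hydrophilic' if score < 0 else 'hydrophobic'})")
```

A negative score indicates a predominantly hydrophilic sequence, which is the sense in which "GRAVY < 0" is read as a sign of solubility.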
Pub Date: 2025-04-12; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251328269
Manh Hung Le, Nam Anh Dao, Xuan Tho Dang
Drug repositioning holds great promise for reducing the time and cost associated with traditional drug discovery, but it faces significant challenges related to data imbalance and noise in negative samples. In this article, we introduce a novel method leveraging high negative oversampling (HNO) to address these challenges. Our approach integrates HNO with advanced techniques such as network-based graph mining, matrix factorization, and Bayesian inference, specifically designed for imbalanced data scenarios. Constructing high-quality negative samples is crucial to mitigate the detrimental effects of noisy negative data and enhance model performance. Experimental results demonstrate the efficacy of our approach in enhancing the performance of drug discovery models by effectively managing data imbalance and refining the selection of negative samples. This methodology provides a robust framework for improving drug repositioning, with potential applications in broader biomedical domains.
Title: Bayesian Inference for Drug Discovery by High Negative Samples and Oversampling. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251328269 (impact factor 2.3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033409/pdf/
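The data imbalance described in the drug repositioning abstract above can be illustrated with plain random oversampling, in which samples from the smaller class are duplicated until the classes are equal in size. This is a generic baseline and explicitly not the authors' HNO method, whose details the abstract does not give; the function name `random_oversample` and the toy data are assumptions.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Balance classes by duplicating randomly chosen samples of smaller classes.

    Generic random oversampling, shown only to illustrate class balancing;
    it is NOT the HNO procedure described in the abstract.
    """
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    target = max(len(group) for group in by_class.values())

    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw with replacement to fill the gap up to the largest class size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy data: 2 positive pairs and 8 negative pairs (hypothetical).
    pairs = [f"pair_{i}" for i in range(10)]
    classes = [1, 1] + [0] * 8
    balanced_pairs, balanced_classes = random_oversample(pairs, classes)
    print(balanced_classes.count(1), balanced_classes.count(0))
```

Duplicating samples balances class counts but adds no new information, which is one reason the choice and quality of negative samples, as the abstract stresses, matters separately from the balancing step.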
Pub Date: 2025-04-03; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251324859
Md Habib Ullah Masum, Ahmad Abdullah Mahdeen, Abanti Barua
With the ability to cause massive epidemics that have consequences for millions of individuals globally, the Chikungunya virus (CHIKV) emerges as a severe menace. Developing an effective vaccine is urgent as no effective therapeutics are available for such viral infections. Therefore, we designed a novel mRNA vaccine against CHIKV with a combination of highly antigenic and potential MHC-I, MHC-II, and B-cell epitopes from the structural polyprotein. The vaccine demonstrated well-characterized physicochemical properties, indicating its solubility and potential functional stability within the body (GRAVY score of -0.639). Structural analyses of the vaccine revealed a well-stabilized secondary and tertiary structure (Ramachandran score of 82.8% and a Z-score of -4.17). Docking studies of the vaccine with TLR-2 (-1027.7 kJ/mol) and TLR-4 (-1212.4 kJ/mol) exhibited significant affinity with detailed hydrogen bond interactions. Molecular dynamics simulations highlighted distinct conformational dynamics among the vaccine, "vaccine-TLR-2" and "vaccine-TLR-4" complexes. The vaccine's ability to elicit both innate and adaptive immune responses, including the presence of memory B-cells and T-cells, persistent B-cell immunity for a year, and the activation of TH cells leading to the release of IFN-γ and IL-2, has significant implications for its potential effectiveness. The CHIKV vaccine developed in this study shows promise as a potential candidate for future vaccine production against CHIKV, suggesting its suitability for further development, including in vitro and in vivo experiments.
Title: Revolutionizing Chikungunya Vaccines: mRNA Breakthroughs With Molecular and Immune Simulations. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251324859 (impact factor 2.3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11967231/pdf/
Pub Date: 2025-04-03; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251324104
Michael Christian Gruber, Daniel Kummer, Katja Sallinger, Henderson James Cleaves, Arsev Umur Aydinoğlu, Thomas Kroneis
The Microchimerism Literature Atlas (MCLA) is a comprehensive online dataset to facilitate the investigation of microchimerism (MC), a condition in which individuals harbor cells from another individual of the same species. The MCLA provides access to more than 15 000 references from MC research, covering peer-reviewed articles and reviews from 1970 to the present. Key features include a multidimensional search function and logical operators for assembling search queries. The MCLA dataset offers a clearly structured data table view, combined with dynamic graphical data representation and visual citation analysis, aiding in the investigation and identification of research trends and patterns. The MCLA supports data export in various formats and receives regular updates. The MCLA is being developed as an essential resource for the MC research community, and its framework is easily adaptable for custom literature datasets, enabling its use in other research fields.
Title: The Microchimerism Literature Atlas. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251324104 (impact factor 2.3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11967202/pdf/
Pub Date: 2025-03-28; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251328302
Abdur Razzak, Otun Saha, Khandokar Fahmida Sultana, Mohammad Ruhul Amin, Abdullah Bin Zahid, Afroza Sultana, Uditi Paul Bristi, Sultana Rajia, Nikkon Sarker, Md Mizanur Rahaman, Newaz Mohammed Bahadur, Foysal Hossen
Shigellosis remains a major global health concern, particularly in regions with poor sanitation and limited access to clean water. This study used immunoinformatics and reverse vaccinology to design a potential mRNA vaccine targeting Shigella pathotypes. Out of 4071 proteins from Shigella sonnei str. Ss046, 4 key antigenic candidates were identified: putative outer membrane protein (Q3YZL0), PapC-like porin protein (Q3YZM5), putative fimbrial-like protein (Q3Z3I2), and lipopolysaccharide (LPS)-assembly protein LptD (Q3Z5V5), ensuring broad pathotype coverage. A multitope vaccine was designed incorporating cytotoxic T lymphocyte, helper T lymphocyte, and B-cell epitopes, joined with suitable linkers and adjuvants to enhance immunogenicity. Computational analyses predicted the vaccine's favorable antigenicity, solubility, and stability, while molecular docking and dynamic simulations demonstrated strong binding affinity and stability with Toll-like receptor 4 (TLR-4), indicating potential for robust immune activation. Immune simulations predicted strong humoral and cellular immune responses, characterized by significant cytokine production and long-term immune memory. Structural evaluations of the complex, including radius of gyration, root mean square deviation, root mean square fluctuation, and solvent accessibility, confirmed the vaccine's structural integrity and stability under physiological conditions. This research contributes to the ongoing effort to alleviate the global burden of Shigella infections, providing a foundation for future wet laboratory investigations aimed at vaccine development.
Title: Development of a Novel mRNA Vaccine Against Shigella Pathotypes Causing Widespread Shigellosis Endemic: An In-Silico Immunoinformatic Approach. Journal: Bioinformatics and Biology Insights, volume 19, article 11779322251328302 (impact factor 2.3). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951904/pdf/
Pub Date: 2025-03-27; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251325390
Md Khairul Islam, Himanshu Wagh, Hairong Wei
The DyGAF model, which stands for Dynamic Gene Attention Focus, is designed to address challenges in biomarker detection, progression reporting of pathogen infection, and disease diagnostics. DyGAF introduces a novel dual-model attention-based mechanism within neural networks, combined with machine learning algorithms, to enhance biomarker identification from gene expression data. DyGAF not only identifies genes but also ranks them by significance, yielding a list of the top genes essential for disease detection and prognosis. In addition, KEGG pathways, Wiki Pathways, and Gene Ontology-based analyses provided a multilevel evaluation of the genes' roles. In our analyses, we applied the model to COVID-19 gene expression profiles from nasopharyngeal swabs, which offer a more nuanced view of the interplay between the host and the virus. For further validation, the genes ranked by DyGAF were compared against those selected by differential expression analysis and random forest feature selection. DyGAF identified important biomarkers that enrich gene ontologies and pathways crucial for elucidating the pathogenesis of COVID-19. Furthermore, DyGAF was used to diagnose COVID-19 patients by classifying gene-expression profiles, with an accuracy of 94.23%. Benchmarking against other conventional models revealed DyGAF's superior performance, highlighting its effectiveness in identifying and categorizing COVID-19 cases. In summary, the DyGAF model represents a significant advancement in genomic research, providing a more comprehensive and precise tool for identifying key genetic markers and gaining insight into the complex biology of a disease.
The DyGAF model is available as a software package at the following link: https://github.com/hiddenntreasure/DyGAF.
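The abstract mentions comparing the genes ranked by DyGAF against those chosen by differential expression analysis and random forest feature selection. One simple way to quantify agreement between two ranked gene lists is the overlap of their top-k entries. The sketch below is a generic illustration, not the DyGAF package's API or the authors' validation procedure; the function name and the placeholder gene labels are hypothetical.

```python
def top_k_overlap(ranking_a, ranking_b, k):
    """Return the genes shared by the top-k of two rankings and their Jaccard index.

    Both rankings are lists of gene identifiers ordered from most to least
    important. The Jaccard index is |intersection| / |union| of the two top-k sets.
    """
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    shared = top_a & top_b
    union = top_a | top_b
    jaccard = len(shared) / len(union) if union else 0.0
    return sorted(shared), jaccard

# Placeholder labels, not real results from any method.
attention_ranking = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
random_forest_ranking = ["GENE_B", "GENE_E", "GENE_A", "GENE_F"]

shared_genes, jaccard_index = top_k_overlap(attention_ranking, random_forest_ranking, k=3)
print(shared_genes, jaccard_index)
```

A high overlap suggests the methods agree on the strongest signals, while genes unique to one ranking are the ones worth inspecting through pathway or ontology enrichment.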
{"title":"Dynamic Gene Attention Focus (DyGAF): Enhancing Biomarker Identification Through Dual-Model Attention Networks.","authors":"Md Khairul Islam, Himanshu Wagh, Hairong Wei","doi":"10.1177/11779322251325390","DOIUrl":"10.1177/11779322251325390","url":null,"PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights","volume":"19 ","pages":"11779322251325390"},"PeriodicalIF":2.3,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951896/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143751083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-03-12; eCollection Date: 2025-01-01; DOI: 10.1177/11779322251321065
Sanni Översti, Ariane Weber, Viktor Baran, Bärbel Kieninger, Alexander Dilthey, Torsten Houwaart, Andreas Walker, Wulf Schneider-Brachert, Denise Kühnert
The importance of genomic surveillance strategies for pathogens has been particularly evident during the coronavirus disease 2019 (COVID-19) pandemic, as genomic data from the causative agent, severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2), have guided public health decisions worldwide. Bayesian phylodynamic inference, integrating epidemiology and evolutionary biology, has become an essential tool in genomic epidemiological surveillance. It enables the estimation of epidemiological parameters, such as the reproductive number, from pathogen sequence data alone. Despite the phylodynamic approach being widely adopted, the abundance of phylodynamic models often makes it challenging to select the appropriate model for specific research questions. This article illustrates the application of phylodynamic birth-death-sampling models in public health using genomic data, with a focus on SARS-CoV-2. Targeting researchers less familiar with phylodynamics, it introduces a comprehensive workflow, including the conceptualisation of a research study and detailed steps for data preprocessing and postprocessing. In addition, we demonstrate the versatility of birth-death-sampling models through three case studies from Germany, utilising the BEAST2 software and its model implementations. Each case study addresses a distinct research question relevant not only to SARS-CoV-2 but also to other pathogens: Case study 1 finds traces of a superspreading event at the start of an early outbreak, exemplifying how simple models for genomic data can provide information that would otherwise only be accessible through extensive contact tracing. Case study 2 compares transmission dynamics in a nosocomial outbreak to community transmission, highlighting distinct dynamics through integrative analysis. 
Case study 3 investigates whether local transmission patterns align with national trends, demonstrating how phylodynamic models can disentangle complex population substructure with little additional information. For each case study, we emphasise critical points where model assumptions and data properties may misalign and outline appropriate validation assessments. Overall, we aim to provide researchers with examples of how to use birth-death-sampling models in genomic epidemiology, balancing theoretical and practical aspects.
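To make the link between a birth-death-sampling model and the reproductive number concrete, the sketch below converts raw rates into the epidemiological quantities commonly reported for such models. It is a minimal illustration under stated assumptions, not the article's BEAST2 setup: it assumes every sampled case stops transmitting on sampling, the function name is hypothetical, and the numeric rates are made up.

```python
def to_epidemiological_parameters(birth_rate, death_rate, sampling_rate):
    """Convert birth-death-sampling rates into epidemiological parameters.

    Assumption: sampled individuals are removed from the infectious pool,
    so the total becoming-uninfectious rate is death_rate + sampling_rate.
    All rates share the same time unit (for example, per day).
    """
    become_uninfectious_rate = death_rate + sampling_rate
    if become_uninfectious_rate <= 0:
        raise ValueError("death_rate + sampling_rate must be positive")
    return {
        # Expected number of secondary infections per infected individual.
        "reproductive_number": birth_rate / become_uninfectious_rate,
        "become_uninfectious_rate": become_uninfectious_rate,
        # Fraction of removed individuals that end up in the sequence data.
        "sampling_proportion": sampling_rate / become_uninfectious_rate,
    }

# Made-up rates per day, chosen only to keep the arithmetic easy to follow.
params = to_epidemiological_parameters(
    birth_rate=0.5, death_rate=0.125, sampling_rate=0.125
)
print(params)
```

Under this parameterisation, a reproductive number above 1 corresponds to a growing outbreak, and the mean infectious period is the reciprocal of the becoming-uninfectious rate.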
{"title":"Evolutionary and epidemic dynamics of COVID-19 in Germany exemplified by three Bayesian phylodynamic case studies.","authors":"Sanni Översti, Ariane Weber, Viktor Baran, Bärbel Kieninger, Alexander Dilthey, Torsten Houwaart, Andreas Walker, Wulf Schneider-Brachert, Denise Kühnert","doi":"10.1177/11779322251321065","DOIUrl":"10.1177/11779322251321065","url":null,"PeriodicalId":9065,"journal":{"name":"Bioinformatics and Biology Insights","volume":"19 ","pages":"11779322251321065"},"PeriodicalIF":2.3,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11898094/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143613195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}