Aim: This in silico study uses deep learning algorithms to predict protein-coding mRNA sequences of Porphyromonas gingivalis (P. gingivalis).
Material and methods: The most differentially expressed mRNAs in P. gingivalis outer membrane vesicles (OMVs) were identified from NCBI GEO dataset GSE218606 using the GEO2R tool. GeneMANIA was used to analyze the networks of differentially expressed genes. Transcriptomics data were collected and labeled as P. gingivalis protein-coding mRNA sequence, pseudogene, lincRNA, or bidirectional promoter lincRNA. After preprocessing, the data were analyzed in Orange, a machine learning tool. The data were partitioned into training and testing sets, and Naïve Bayes, neural network, and gradient boosting models were trained to predict the sequence class. The models were validated by cross-validation, and classification accuracy and ROC curves were evaluated.
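The partitioning step described above can be sketched in a few lines of plain Python. This is only an illustration of k-fold cross-validation in general, not the procedure Orange applies internally; the function name `k_fold_indices`, the number of folds, and the random seed are all assumptions, since the abstract does not report them.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Hypothetical helper: the fold count and seed are illustrative
    assumptions, not values reported by the study.
    """
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(n_samples=10, k=5):
        print(len(train), "training samples,", len(test), "test samples")
```

Each sample appears in exactly one test fold, so every labeled sequence is used for both training and evaluation across the k rounds.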
Results: Three models, Neural Networks, Naïve Bayes, and Gradient Boosting, were evaluated using Area Under the Curve (AUC), Classification Accuracy (CA), F1 Score, Precision, Recall, and Specificity. Gradient Boosting achieved a balanced performance (AUC: 0.72, CA: 0.41, F1: 0.32) compared to Neural Networks (AUC: 0.721, CA: 0.391, F1: 0.314) and Naïve Bayes (AUC: 0.701, CA: 0.172, F1: 0.114). While statistical tests revealed no significant differences between the models, Gradient Boosting exhibited a more balanced precision-recall relationship.
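For readers unfamiliar with the metrics named above, the following is a minimal sketch of how precision, recall, F1, specificity, and classification accuracy can be computed for one class against the rest. The function name `one_vs_rest_metrics` and the toy labels are hypothetical; they are not the study's data, and the study's own averaging across classes is not specified in the abstract.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Compute per-class metrics, treating `positive` as the target class.

    Hypothetical helper for illustration; labels are arbitrary strings.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Return 0.0 when the denominator is zero instead of raising.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "specificity": ratio(tn, tn + fp),
        # Multi-class classification accuracy: fraction of exact matches.
        "classification_accuracy": ratio(
            sum(t == p for t, p in zip(y_true, y_pred)), len(y_true)
        ),
    }


if __name__ == "__main__":
    # Toy labels (assumed, not from the study).
    truth = ["protein_coding", "protein_coding", "lincRNA", "pseudogene"]
    preds = ["protein_coding", "lincRNA", "lincRNA", "protein_coding"]
    print(one_vs_rest_metrics(truth, preds, positive="protein_coding"))
```

F1 is the harmonic mean of precision and recall, which is why a model with a low F1 can still show a moderate AUC when its ranking of classes is better than its hard predictions.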
Conclusion: In silico analysis using machine learning techniques successfully predicted protein-coding mRNA sequences within Porphyromonas gingivalis OMVs. Gradient Boosting outperformed the other models (Neural Networks and Naïve Bayes) by achieving a balanced performance across AUC, classification accuracy, and precision-recall, suggesting its potential as a reliable tool for protein-coding mRNA prediction in P. gingivalis OMVs.
