Development of COVID-19 mRNA Vaccine Degradation Prediction System
Soon Hwai Ing, A. Abdullah, Shigehiko Kanaya
2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), published 2021-09-29
DOI: 10.1109/3ICT53449.2021.9582052 (https://doi.org/10.1109/3ICT53449.2021.9582052)
Citations: 1
Abstract
The coronavirus outbreak, declared a global pandemic, has shaken not only public health but also society, the economy, and every walk of life. Measures have been taken to curb its spread, and one of the most effective is to take precautions that prevent SARS-CoV-2 from reaching uninfected populations. Vaccination is one such precaution. Among the available vaccines, mRNA vaccines, which the study describes as highly effective and free of side effects, are considered the most preferable candidates. However, degradation is the biggest obstacle to their deployment. This study therefore aims to develop models that predict the degradation rate of mRNA vaccines for COVID-19. Three machine learning algorithms, Linear Regression (LR), Light Gradient Boosting Machine (LGBM), and Random Forest (RF), are used to develop 12 models. A dataset from the Eterna platform, comprising thousands of RNA molecules with degradation rates recorded at each position, is extracted, pre-processed, and label-encoded before being passed to the algorithms. The results show that the LGBM-based model trained with auxiliary bpps features and encoded with label encoding method 1 performs best (RMSE = 0.24466), followed by the LGBM-based model trained under the same conditions but encoded with label encoding method 2, which trails the top model by 0.00003 in RMSE. The RF-based model also performs well (RMSE = 0.25302) even without the bpps features, in contrast to the training and encoding setup of the best LGBM-based model, and it merits further development for predicting the degradation rate of COVID-19 mRNA vaccines.
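The two building blocks the abstract relies on, label encoding of RNA sequences and RMSE as the evaluation metric, can be illustrated with a minimal sketch. The abstract does not define its "method 1" and "method 2" encodings, so the base-to-integer mapping below is an assumption made purely for illustration, as are the function names and the toy degradation values; none of them come from the paper.

```python
import math

# Hypothetical vocabulary: the abstract does not specify either of its
# label-encoding methods, so this mapping is an illustrative assumption.
BASE_TO_LABEL = {"A": 0, "C": 1, "G": 2, "U": 3}


def label_encode(sequence, vocab=BASE_TO_LABEL):
    """Map each character of an RNA sequence to its integer label."""
    return [vocab[base] for base in sequence]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length, non-empty lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Encode a short toy sequence, one integer per position.
encoded = label_encode("GGAAC")
print(encoded)

# Toy per-position degradation values, made up for illustration only.
observed = [0.5, 1.0, 0.0, 0.25]
predicted = [0.5, 0.5, 0.0, 0.75]
print(rmse(observed, predicted))
```

In a setup like the one described, the encoded positions (optionally combined with bpps-derived features) would serve as model inputs, and RMSE over the per-position predictions would be the number used to compare models.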