Improving Allergenic Protein Prediction Using Physicochemical Features on Non-Redundant Sequences
Authors: Sher Singh, Jr-Rou Chiu, Kuei-Ling Sun, E. C. Su
Published in: 2019 International Conference on Machine Learning and Cybernetics (ICMLC), 2019-07-01
DOI: 10.1109/ICMLC48188.2019.8949197 (https://doi.org/10.1109/ICMLC48188.2019.8949197)
Citation count: 0
Abstract
Despite extensive work on allergen prediction, current approaches leave room for improved performance and lack interpretable biological features. Developing sequence-based allergen prediction methods has therefore become important for facilitating in silico vaccine design. In this study, we propose a systematic approach that predicts allergenic proteins by incorporating sequence and physicochemical properties into machine learning algorithms. In addition, the predictive performance reported in previous studies may be overestimated because of high redundancy in the data sets. We therefore reduce sequence redundancy in the data set, and experimental results show that our approach achieves better predictive performance than other approaches. This study can help in the discovery of new prophylactic and therapeutic vaccines for diseases. Moreover, we analyze immunological features that can provide valuable insights into immunotherapies for allergy and autoimmune diseases in translational bioinformatics.
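The abstract does not say which physicochemical properties or which redundancy-reduction tool were used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes amino acid composition, mean Kyte-Doolittle hydropathy, and a crude net-charge fraction as features, and it uses Python's `difflib.SequenceMatcher` ratio as a rough stand-in for sequence identity. The function names `physicochemical_features` and `reduce_redundancy` and the 0.9 threshold are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Kyte-Doolittle hydropathy scale (assumed feature; the paper's feature set is not given).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = sorted(KD)  # fixed order for the composition part of the vector


def physicochemical_features(seq):
    """Return 20 composition fractions, mean hydropathy, and net-charge fraction."""
    residues = [aa for aa in seq.upper() if aa in KD]  # skip non-standard symbols
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    composition = [residues.count(aa) / n for aa in AMINO_ACIDS]
    mean_hydropathy = sum(KD[aa] for aa in residues) / n
    positive = sum(residues.count(aa) for aa in "KR") / n
    negative = sum(residues.count(aa) for aa in "DE") / n
    return composition + [mean_hydropathy, positive - negative]


def reduce_redundancy(sequences, threshold=0.9):
    """Greedily keep a sequence only if it is below `threshold` similarity
    to every sequence already kept (hypothetical threshold)."""
    kept = []
    for seq in sequences:
        if all(
            SequenceMatcher(None, seq, other, autojunk=False).ratio() < threshold
            for other in kept
        ):
            kept.append(seq)
    return kept


if __name__ == "__main__":
    data = ["ACDEFGHIKL", "ACDEFGHIKL", "WWWWWYYYYY"]
    unique = reduce_redundancy(data)
    vectors = [physicochemical_features(s) for s in unique]
    print(len(unique), len(vectors[0]))
```

The resulting fixed-length vectors could be passed to any standard classifier. In practice a dedicated clustering tool and alignment-based identity would likely replace the string-similarity filter shown here, which is quadratic in the number of sequences.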