A Novel Method to Identify Golgi Protein Types Based on Hybrid Feature and SVM Algorithm
Authors: Liang Ma, Hailin Jiang, Wanli Yang, Quanjie Zhu
Journal: Int. J. Comput. Intell. Appl.
Publication date: 2020-12-01
DOI: https://doi.org/10.1142/s1469026820500273
Citations: 1
Abstract
Accurate identification of Golgi protein types can provide useful clues about the correlation between Golgi apparatus (GA) dysfunction and disease pathology, and can improve the ability to develop more effective treatments for the associated diseases. This paper introduces an effective and robust method for classifying Golgi protein types with traditional machine learning algorithms. Various features, such as n-GDip, DCCA, and psePSSM, were used as training features, and an SVM with a linear kernel was employed as the classifier. To address the class imbalance in the benchmark datasets, the oversampling technique SMOTE was adopted. To handle the large number of features, the PCA algorithm and the Fisher feature selection method were adopted to reduce the feature dimension and remove redundant features. The experimental results show that the proposed method improved on other traditional machine learning methods in 10-fold cross-validation, jackknife cross-validation, and independent testing, which is a further step toward the clinical application of computational methods for predicting Golgi protein types.
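Two of the steps named in the abstract, SMOTE oversampling and Fisher-score feature ranking, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy data, the function names `smote` and `fisher_scores`, the neighbour count `k`, and the particular Fisher score formula (between-class scatter divided by within-class scatter, per feature) are all assumptions made for illustration. The abstract does not say which Fisher variant or which SMOTE settings were used.

```python
import random
from statistics import fmean


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def fisher_scores(X, y):
    """Per-feature Fisher score (assumed form): between-class scatter
    divided by within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = fmean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            mu = fmean(vals)
            between += len(vals) * (mu - overall) ** 2
            within += sum((v - mu) ** 2 for v in vals)
        if within == 0:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append(between / within)
    return scores


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
majority = [[0.0, 1.0], [0.2, 5.0], [0.1, 3.0], [0.3, 2.0], [0.0, 4.0], [0.2, 0.0]]
minority = [[1.0, 2.0], [1.2, 4.0], [0.9, 1.0]]

# Oversample the minority class until both classes have the same size.
balanced_minority = minority + smote(minority, len(majority) - len(minority), k=2)
X = majority + balanced_minority
y = [0] * len(majority) + [1] * len(balanced_minority)

# Rank features from most to least discriminative.
scores = fisher_scores(X, y)
ranking = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
print(ranking)
```

In a full pipeline of the kind the abstract describes, the top-ranked features would then go to a dimensionality reduction step such as PCA and a linear-kernel SVM. Oversampling is usually applied only to the training folds during cross-validation so that synthetic points do not leak into the test data. Whether the paper does this is not stated in the abstract.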