Nor Rahayu Ngatirin, Z. Zainol, Tan Lee Chee Yoong
Title: A comparative study of different classifiers for automatic personality prediction
Published in: 2016 6th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), pp. 435-440
Publication date: 2016-01-01
DOI: 10.1109/ICCSCE.2016.7893613 (https://doi.org/10.1109/ICCSCE.2016.7893613)
Citations: 20
Abstract
Personality is described as a fairly fixed feature of an individual that indicates the individual's preferences. Personality has been shown to be relevant to many types of interactions, such as predicting movie preferences, social relationships, the link between personality and music, and the correlation between personality and job performance. Predicting personality from social media has become a current trend because the extracted information can be used to improve users' experiences with various computerized interfaces. Consequently, many algorithms have been applied to predict personality from social media. In this paper, we compare the performance of several groups of classifiers provided in WEKA, namely Bayes, Functions, Rules, Trees, and Meta, in predicting students' personality. Based on an adopted framework, the profile data of undergraduate students were extracted from Twitter, analyzed, and then classified for automatic personality prediction. Four features of the profile data with significant correlation were selected and mapped to the Big Five personality model. Only the extraversion dimension of the Big Five was considered in this study. 10-fold cross-validation was used to evaluate the classifiers. The performance parameters observed were classification accuracy, F-measure, time taken to build the model, Kappa statistic, and training errors. Experimental evaluation demonstrated that the OneR algorithm is the best classifier in terms of accuracy, F-measure, and Kappa statistic.
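To make the evaluation procedure in the abstract concrete, the following is a minimal Python sketch of a OneR-style classifier scored with 10-fold cross-validation, accuracy, F-measure, and the Kappa statistic. It is not the authors' WEKA setup: the four discretized features, the class labels, and the toy data generator are hypothetical stand-ins, since the abstract does not name the Twitter profile features that were used.

```python
import random
from collections import Counter, defaultdict


def train_one_r(rows, labels):
    """Fit a OneR-style model: one rule table on the single best feature."""
    default = Counter(labels).most_common(1)[0][0]  # fallback for unseen values
    best = None
    for f in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[f]][y] += 1
        # Each feature value predicts its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[f]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, f, rule)
    return best[1], best[2], default


def predict_one_r(model, row):
    feature, rule, default = model
    return rule.get(row[feature], default)


def cross_val_predict(rows, labels, k=10, seed=0):
    """Return one out-of-fold prediction per row using k-fold cross-validation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    preds = [None] * len(rows)
    for fold in range(k):
        test = set(idx[fold::k])
        train = [i for i in idx if i not in test]
        model = train_one_r([rows[i] for i in train], [labels[i] for i in train])
        for i in test:
            preds[i] = predict_one_r(model, rows[i])
    return preds


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the given positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return 0.0 if expected == 1 else (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical toy data: four binned features, label driven by the first one.
    rng = random.Random(42)
    rows, labels = [], []
    for _ in range(100):
        row = tuple(rng.choice(["low", "high"]) for _ in range(4))
        label = "extravert" if row[0] == "high" else "introvert"
        if rng.random() < 0.2:  # flip about 20% of labels as noise
            label = "introvert" if label == "extravert" else "extravert"
        rows.append(row)
        labels.append(label)
    preds = cross_val_predict(rows, labels, k=10)
    print("accuracy: ", accuracy(labels, preds))
    print("F-measure:", f_measure(labels, preds, positive="extravert"))
    print("kappa:    ", cohen_kappa(labels, preds))
```

OneR is a natural baseline here because it builds its rule from a single feature, so its score shows how much of the signal one profile attribute carries on its own. Note that this sketch assumes already-discretized features, whereas WEKA's OneR bins numeric attributes itself.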