Wegdan A. Hussien, Yahya M. Tashtoush, M. Al-Ayyoub, M. Al-Kabi
2016 7th International Conference on Computer Science and Information Technology (CSIT), published 2016-07-13. DOI: 10.1109/CSIT.2016.7549459 (https://doi.org/10.1109/CSIT.2016.7549459)
Are emoticons good enough to train emotion classifiers of Arabic tweets?
Automatic emotion detection is now used by applications in many fields, such as security informatics, e-learning, humor detection, and targeted advertising, and many of these applications focus on social media. In this study, we address the problem of emotion detection in Arabic tweets. We focus on the supervised approach, in which a classifier is trained on an already labeled dataset. Such a training set is typically annotated by hand, which is expensive and time-consuming. We propose instead to annotate the training data automatically using emojis, which are a new generation of emoticons. We show that this approach produces classifiers that are more accurate than those trained on a manually annotated dataset. To this end, we construct a dataset of emotional Arabic tweets covering four emotion classes: anger, disgust, joy, and sadness. We consider two classifiers: Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB). Our tests show that SVM and MNB classifiers trained on automatically labeled data outperform those trained on manually labeled data.
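The emoji-based labeling idea described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the emoji-to-emotion mapping `EMOJI_TO_EMOTION`, the rule of discarding tweets with no mapped emoji or with conflicting emojis, the whitespace tokenizer, and the toy English sentences standing in for Arabic tweets. The small `MultinomialNB` class is a textbook multinomial Naive Bayes with Laplace smoothing, not the authors' implementation; an SVM could be trained on the same auto-labeled pairs.

```python
import math
from collections import Counter, defaultdict

# Hypothetical emoji-to-emotion mapping (assumption; not the paper's list).
EMOJI_TO_EMOTION = {
    "\U0001F620": "anger",    # angry face
    "\U0001F621": "anger",    # pouting face
    "\U0001F922": "disgust",  # nauseated face
    "\U0001F602": "joy",      # face with tears of joy
    "\U0001F60A": "joy",      # smiling face with smiling eyes
    "\U0001F622": "sadness",  # crying face
    "\U0001F62D": "sadness",  # loudly crying face
}


def auto_label(tweet):
    """Return (text_without_emojis, label), or None if the tweet has
    no mapped emoji or its emojis point to more than one emotion."""
    labels = {EMOJI_TO_EMOTION[ch] for ch in tweet if ch in EMOJI_TO_EMOTION}
    if len(labels) != 1:
        return None
    # Strip the emojis so the classifier cannot simply memorize them.
    text = "".join(ch for ch in tweet if ch not in EMOJI_TO_EMOTION)
    return " ".join(text.split()), labels.pop()


class MultinomialNB:
    """Textbook multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(count / n_docs)  # log prior
            for word in doc.split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy stand-ins for tweets (made up for this sketch).
raw_tweets = [
    "what a wonderful day \U0001F60A",
    "I love this wonderful song \U0001F602",
    "I miss my old friends \U0001F622",
    "such a lonely sad night \U0001F62D",
    "the traffic makes me furious \U0001F621",
    "stop shouting at me \U0001F620",
    "this food smells rotten \U0001F922",
    "no emoji here, so it is skipped",
    "mixed feelings \U0001F602 \U0001F62D",
]

labeled = [pair for pair in map(auto_label, raw_tweets) if pair is not None]
texts, labels = zip(*labeled)
model = MultinomialNB().fit(texts, labels)
print(model.predict("wonderful song"))  # prints: joy
```

The point of the sketch is that no human annotator appears anywhere in the pipeline: the emoji supplies the label, and the remaining text supplies the features.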