{"title":"Lexicon-based emotion analysis in Turkish","authors":"Mansur Alp Toçoğlu, A. Alpkocak","doi":"10.3906/ELK-1807-41","DOIUrl":null,"url":null,"abstract":"In this paper, we proposed a lexicon for emotion analysis in Turkish for six emotional categories happiness, fear, anger, sadness, disgust, and surprise. Besides, we also investigated the effects of a lemmatizer and a stemmer, two term-weighting schemes, four lexicon enrichment methods, and a term selection approach for lexicon construction. To do this, we generated Turkish emotion lexicon based on a dataset, TREMO, containing 25,989 documents. We then preprocessed the documents to obtain dictionary and stem forms of each term using a lemmatizer and a stemmer. Afterwards, we proposed two different weighting schemes where term frequency, term-class frequency and mutual information (MI) values for six emotion categories are taken into consideration. We then enriched the lexicon by using bigram and concept hierarchy methods, and performed term selection for efficiency issues. Then, we compared the performance of lexicon-based approach with machine learning based approach by using our proposed lexicon. The experiments showed that the use of the proposed lexicon efficiently produces comparable results in emotion analysis in Turkish text.","PeriodicalId":49410,"journal":{"name":"Turkish Journal of Electrical Engineering and Computer Sciences","volume":"52 1","pages":""},"PeriodicalIF":1.2000,"publicationDate":"2019-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Turkish Journal of Electrical Engineering and Computer Sciences","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3906/ELK-1807-41","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 13
Abstract
In this paper, we proposed a lexicon for emotion analysis in Turkish for six emotional categories happiness, fear, anger, sadness, disgust, and surprise. Besides, we also investigated the effects of a lemmatizer and a stemmer, two term-weighting schemes, four lexicon enrichment methods, and a term selection approach for lexicon construction. To do this, we generated Turkish emotion lexicon based on a dataset, TREMO, containing 25,989 documents. We then preprocessed the documents to obtain dictionary and stem forms of each term using a lemmatizer and a stemmer. Afterwards, we proposed two different weighting schemes where term frequency, term-class frequency and mutual information (MI) values for six emotion categories are taken into consideration. We then enriched the lexicon by using bigram and concept hierarchy methods, and performed term selection for efficiency issues. Then, we compared the performance of lexicon-based approach with machine learning based approach by using our proposed lexicon. The experiments showed that the use of the proposed lexicon efficiently produces comparable results in emotion analysis in Turkish text.
期刊介绍:
The Turkish Journal of Electrical Engineering & Computer Sciences is published electronically 6 times a year by the Scientific and Technological Research Council of Turkey (TÜBİTAK)
Accepts English-language manuscripts in the areas of power and energy, environmental sustainability and energy efficiency, electronics, industry applications, control systems, information and systems, applied electromagnetics, communications, signal and image processing, tomographic image reconstruction, face recognition, biometrics, speech processing, video processing and analysis, object recognition, classification, feature extraction, parallel and distributed computing, cognitive systems, interaction, robotics, digital libraries and content, personalized healthcare, ICT for mobility, sensors, and artificial intelligence.
Contribution is open to researchers of all nationalities.