Kurdish Language Sentiment Analysis: Problems and Challenges
Miran Hama Saeed Mohammed Amin, Omar Al-Rassam, Zhenar Shaho Faeq
Philippine Statistician, published 2022-09-22. DOI: 10.17762/msea.v71i4.890 (https://doi.org/10.17762/msea.v71i4.890)
Citations: 1
Abstract
The increasing use of blogs, social networks, and forums for sharing opinions has created vast amounts of internet data. As a result, sentiment analysis has gained great popularity among researchers and in industry as a way to analyze the polarity of users' opinions. In recent years, sentiment analysis has been applied to various languages using machine learning, corpus-based, and deep learning techniques, since it is beneficial for creating effective recommender systems. Kurdish is an Indo-European language, one of the official languages of Iraq, and is also widely used in Turkey, Iran, and Syria. Although the language is spoken by over 40 million people, to the best of our knowledge no research has been done on the challenges and problems of Kurdish sentiment analysis. Our research aims to highlight the latest studies and examine the most critical challenges of applying sentiment analysis approaches to the Kurdish language. The study identifies the challenges that arise at each step of the sentiment analysis process for Kurdish. In addition, we propose a methodology that could help address most of these challenges: a hybrid approach that combines machine learning and lexicon-based approaches to improve the performance of sentiment classification in the Kurdish language.
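The hybrid idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two components are combined, so everything below is an assumption made for illustration: the toy lexicon (placeholder English tokens standing in for Kurdish entries), the small Naive Bayes classifier, the function names, and the equal weighting of the two scores.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; placeholder English tokens stand in for Kurdish entries.
LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "awful": -1.0}


def lexicon_score(tokens):
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


class NaiveBayes:
    """Two-class multinomial Naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.counts = {"pos": Counter(), "neg": Counter()}
        self.priors = Counter(labels)
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
        self.vocab = set(self.counts["pos"]) | set(self.counts["neg"])
        return self

    def log_prob(self, tokens, label):
        total = sum(self.counts[label].values())
        lp = math.log(self.priors[label] / sum(self.priors.values()))
        for t in tokens:
            lp += math.log((self.counts[label][t] + 1) / (total + len(self.vocab)))
        return lp

    def score(self, tokens):
        """Return P(pos) - P(neg), a value between -1 and 1."""
        lp = self.log_prob(tokens, "pos")
        ln = self.log_prob(tokens, "neg")
        p_pos = 1.0 / (1.0 + math.exp(ln - lp))
        return 2.0 * p_pos - 1.0


def hybrid_predict(tokens, model, weight=0.5):
    """Weighted average of lexicon and classifier scores (weight is an assumption)."""
    combined = weight * lexicon_score(tokens) + (1 - weight) * model.score(tokens)
    return "pos" if combined >= 0 else "neg"


# Tiny made-up training set, for illustration only.
docs = [["good", "service"], ["great", "food"], ["bad", "service"], ["awful", "food"]]
labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(docs, labels)
```

With this toy data, `hybrid_predict(["good", "food"], model)` combines a positive lexicon hit with a mildly positive classifier score. A real system would need a Kurdish lexicon, Kurdish-specific tokenization and normalization, and labeled training data, which the abstract names among the open challenges.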
About the journal:
The Journal aims to provide a medium for the dissemination of research by statisticians and by researchers who use statistical methods to resolve their research problems. While a broad spectrum of topics will be entertained, priority will be given to papers that make an original contribution to statistical science or that illustrate novel applications of statistics in solving real-life problems. The scope includes, but is not limited to, the following topics: Official Statistics, Computational Statistics, Simulation Studies, Mathematical Statistics, Survey Sampling, Statistics Education, Time Series Analysis, Biostatistics, Nonparametric Methods, Experimental Designs and Analysis, Econometric Theory and Applications, and Other Applications.