Nazerke Sultanova, K. Kozhakhmet, R. Jantayev, Azhar Botbayeva
Stemming algorithm for Kazakh Language using rule-based approach
2019 15th International Conference on Electronics, Computer and Computation (ICECCO), published 2019-12-01
DOI: 10.1109/ICECCO48375.2019.9043253 (https://doi.org/10.1109/ICECCO48375.2019.9043253)
Citations: 2
Abstract
This paper considers how words are constructed in the Kazakh language. Kazakh is a Turkic language and is agglutinative: words are formed by attaching a finite sequence of suffixes to a stem. Investigating Kazakh morphology was the main work in constructing the rule-based stemming algorithm. The stemmer will be useful in developing sentiment analysis for Kazakh, where such tools are still rare. The algorithm works by cutting off combinations of endings. The results were checked with machine learning algorithms on an annotated dataset retrieved from the KazCorpus dataset [7].
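The idea of cutting ending combinations can be illustrated with a minimal sketch. This is not the authors' rule set: the suffix list below is a small illustrative sample of common Kazakh plural, genitive, and locative endings, and the names `SUFFIXES`, `MIN_STEM_LENGTH`, and `stem` are hypothetical. The sketch assumes that endings are removed one at a time from the end of the word, longest match first, and that a minimum stem length guards against stripping short words down to nothing.

```python
# Illustrative sample of Kazakh endings; NOT the rule set from the paper.
SUFFIXES = [
    "лар", "лер", "дар", "дер", "тар", "тер",  # plural
    "ның", "нің", "дың", "дің", "тың", "тің",  # genitive
    "да", "де", "та", "те",                    # locative
]

# Try longer endings first so a short ending never shadows a longer one.
SUFFIXES_BY_LENGTH = sorted(SUFFIXES, key=len, reverse=True)

# Assumed guard: never cut a word down to fewer than this many characters.
MIN_STEM_LENGTH = 3


def stem(word: str) -> str:
    """Repeatedly strip one known ending from the end of the word."""
    word = word.lower()
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES_BY_LENGTH:
            remaining = len(word) - len(suffix)
            if word.endswith(suffix) and remaining >= MIN_STEM_LENGTH:
                word = word[:-len(suffix)]
                stripped = True
                break  # restart the scan on the shortened word
    return word


if __name__ == "__main__":
    for example in ["кітаптарда", "мектептерде", "балалар", "ата"]:
        print(example, "->", stem(example))
```

Here "кітаптарда" ("in the books") loses the locative "да" and then the plural "тар", while "ата" is left untouched because cutting "та" would leave a stem shorter than the assumed minimum. A real stemmer would need a much fuller inventory of endings and rules for their valid order.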