Automatic Tweet Classification Based on News Category in Indonesian Language
Jaka E. Sembodo, E. B. Setiawan, M. Bijaksana
2018 6th International Conference on Information and Communication Technology (ICoICT), published 2018-05-01
DOI: 10.1109/ICOICT.2018.8528788 (https://doi.org/10.1109/ICOICT.2018.8528788)
Tweets can be as informative as news articles, so an automatic tweet classifier based on news categories can make it easier to search for tweets in a category of interest. We identified 11 categories: religion, business, entertainment, law and crime, health, motivation, sport, government, education, politics, and technology. For learning, we use the ZeroR, Naive Bayes Multinomial (NBM), Support Vector Machine (SVM), Random Forest (RF), and Sequential Minimal Optimization (SMO) algorithms, chosen on the basis of previous work on similar topics. In the experiments, we train the classifiers on all tweets and on subsets with varying maximum numbers of tweets and terms per category. We evaluate performance with 10-fold cross-validation, using accuracy (the proportion of correctly classified instances) as the performance parameter. NBM achieves the highest accuracy, 77.47%, when each category is limited to at most 500 tweets and 1000 terms. Finally, because NBM performed best in the experiments, we built a web-based automatic tweet classifier with it.
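The evaluation setup described in the abstract (a cap on tweets and terms per category, a multinomial Naive Bayes classifier, and 10-fold cross-validation scored by accuracy) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: whitespace tokenization, Laplace smoothing, the unstratified fold split, the reading of the term cap as "the most frequent terms of each category", and every function name below are assumptions introduced here.

```python
import math
import random
from collections import Counter, defaultdict


def cap_per_category(samples, max_tweets):
    """Keep at most max_tweets (text, label) pairs for each label."""
    kept, seen = [], Counter()
    for text, label in samples:
        if seen[label] < max_tweets:
            kept.append((text, label))
            seen[label] += 1
    return kept


def train_nbm(samples, max_terms):
    """Fit multinomial Naive Bayes with Laplace (add-one) smoothing.

    Assumption: the vocabulary is the union of the max_terms most
    frequent terms of each category.
    """
    term_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        term_counts[label].update(text.lower().split())

    vocab = set()
    for counts in term_counts.values():
        vocab.update(term for term, _ in counts.most_common(max_terms))

    total_docs = sum(doc_counts.values())
    model = {}
    for label, counts in term_counts.items():
        # Total in-vocabulary term occurrences plus one pseudo-count per term.
        denom = sum(c for t, c in counts.items() if t in vocab) + len(vocab)
        log_prior = math.log(doc_counts[label] / total_docs)
        log_like = {t: math.log((counts[t] + 1) / denom) for t in vocab}
        model[label] = (log_prior, log_like)
    return model


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    tokens = text.lower().split()

    def score(label):
        log_prior, log_like = model[label]
        # Terms outside the vocabulary are ignored.
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)

    # sorted() makes tie-breaking deterministic.
    return max(sorted(model), key=score)


def cross_val_accuracy(samples, k=10, max_terms=1000, seed=0):
    """Fraction of correctly classified instances over k folds."""
    data = list(samples)
    random.Random(seed).shuffle(data)
    correct = 0
    for fold in range(k):
        test = data[fold::k]  # every k-th item, offset by fold
        train = [s for i, s in enumerate(data) if i % k != fold]
        model = train_nbm(train, max_terms)
        correct += sum(predict(model, text) == label for text, label in test)
    return correct / len(data)
```

With a labeled list of (text, label) pairs, the best-performing configuration reported in the abstract would correspond to calling cross_val_accuracy(cap_per_category(samples, 500), k=10, max_terms=1000); the accuracy obtained will depend on the data and on the assumptions listed above.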