Sentiment Analysis from Indonesian Twitter Data Using Support Vector Machine and Query Expansion Ranking

JOIN Jurnal Online Informatika. Pub Date: 2022-06-30. DOI: 10.15575/join.v7i1.669

Hasbi Atsqalani, Nur Hayatin, Christian Sri Kusuma Aditya


Citations: 0

Abstract




Sentiment analysis is the computational study of opinions, sentiments, and feelings expressed in text. Twitter has become a popular social network among Indonesians. For a public figure running for president of Indonesia, public opinion is an important indicator of the candidate's popularity, and media has become one of the important tools used to increase electability. However, analyzing sentiment in tweets is not easy because they contain unstructured text, particularly in Indonesian. The purpose of this research is to classify Indonesian Twitter data into positive and negative sentiment polarity using a Support Vector Machine (SVM) with Query Expansion Ranking, so that the information the tweets contain can be extracted and made useful to those who need it. The research proceeds through several stages: data crawling, data preprocessing, Term Frequency-Inverse Document Frequency (TF-IDF) weighting, feature selection with Query Expansion Ranking, and classification with the SVM method. Classification performance is evaluated with a confusion matrix. The results show that the proposed method reached an accuracy of 77% and an F-measure of 68%.
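Two of the stages named in the abstract, TF-IDF weighting and confusion-matrix evaluation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the "pos" label, and the textbook weighting tf * log(N / df) are assumptions made for illustration, and the paper may use a different TF-IDF variant. The Query Expansion Ranking and SVM stages are left out because the abstract does not give their details.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf(t, d) * log(N / df(t)), where tf is the relative
    term frequency in the document and df is the document frequency.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def confusion_matrix(y_true, y_pred, positive="pos"):
    """Count (tp, fp, fn, tn) for a binary positive/negative task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_and_f_measure(tp, fp, fn, tn):
    """Derive accuracy and F-measure (F1) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)
```

Because the F-measure is the harmonic mean of precision and recall for the positive class, it can fall well below accuracy when the classes are imbalanced or one class is predicted poorly, which is one possible reading of the gap between the reported 77% and 68%.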


Source journal: JOIN Jurnal Online Informatika

Self-citation rate: 0.00%

Review time: 12 weeks