Implementasi Machine Learning Dengan Metode Text Mining Pada Twitter (Implementation of Machine Learning with Text Mining Methods on Twitter)

Infotek : Jurnal Informatika dan Teknologi Pub Date : 2024-01-20 DOI:10.29408/jit.v7i1.23734

Hamdun Sulaiman, Muhamad Ryansyah, Kudiantoro Widianto, Sidik Sidik, Andria Nugraha


Abstract

PT. Telkom Indonesia (Indihome) currently uses social media as a channel for handling customer complaints. Tweets from Indihome customers are handled by the company's customer service division, which manually categorizes every complaint tweet sent to the @indihome Twitter account. This manual process is considered inefficient. The purpose of this research is to address the problem of categorizing complaint tweets and to develop tools that can extract the text of complaint tweets written in Indonesian. The research method is comparative. The Gata Framework and RapidMiner tools are used to preprocess and clean the dataset in support of corpus creation and sentiment analysis. After cleansing and preprocessing, the dataset contains 1,510 entries. With the method proposed in this study, the best result obtained with the Support Vector Machine classification algorithm was 82.42% accuracy, 75.33% precision, and 98.75% recall, with an AUC of 0.826.
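
The accuracy, precision, and recall figures in the abstract are presumably the standard confusion-matrix metrics. The following is a minimal sketch of how such metrics are computed, assuming a binary labelling (for example, complaint versus non-complaint); the abstract does not state the label scheme. The function name `classification_metrics` and the counts passed to it are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Each ratio falls back to 0.0 if its denominator is zero.
    """
    total = tp + fp + fn + tn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that the classifier found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts, for illustration only; NOT results from the paper.
metrics = classification_metrics(tp=8, fp=2, fn=1, tn=9)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these definitions, a recall close to 100% combined with a noticeably lower precision, as reported in the abstract, would mean the classifier misses very few positive tweets but also labels some negative ones as positive.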
