
Jurnal Informatika Jurnal Pengembangan IT | Pub Date: 2021-10-01 | DOI: 10.30591/JPIT.V6I3.3016

Dwi Intan Af’idah, Dairoh Dairoh, Sharfina Febbi Handayani, Riszki Wijayatun Pratiwi


Citations: 6



Pengaruh Parameter Word2Vec terhadap Performa Deep Learning pada Klasifikasi Sentimen (The Effect of Word2Vec Parameters on Deep Learning Performance in Sentiment Classification)

Deep learning can overcome the difficulty of sentiment classification on big data. Before a deep learning model is trained and tested, word features must be extracted. Word2Vec is often used for this step in pre-training for sentiment classification because it captures the semantic meaning of text by assigning similar vectors to words with similar meanings. Word2Vec has three parameters that affect the model learning process: architecture, evaluation method, and dimension. This study aims to determine the effect of each Word2Vec parameter on deep learning performance in sentiment classification, using the accuracy of the deep learning model as the measure. The results indicate that all three Word2Vec parameters influence the performance of the deep learning model in sentiment classification. The combination that produces the highest average accuracy is the CBOW (Continuous Bag of Words) architecture, the Hierarchical Softmax evaluation method, and a dimension of 100. CBOW performs better because it is slightly more accurate for frequent words, and the research dataset contains many frequent words. Hierarchical Softmax gives better results because it uses a binary tree model in which rarely occurring words inherit the vector representations of the nodes above them. A dimension of 100 gives better accuracy because it suits the dataset size of 10,000 reviews.
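The three parameters named in the abstract can be laid out as a small configuration grid. The sketch below is illustrative only and is not the authors' code: it assumes the keyword names used by gensim 4.x (`sg`, `hs`, `negative`, `vector_size`), assumes Negative Sampling as the alternative to Hierarchical Softmax, and uses hypothetical candidate dimensions, since the abstract reports only the value 100.

```python
from itertools import product

# Architecture choices, mapped to gensim's `sg` flag (0 = CBOW, 1 = Skip-gram).
ARCHITECTURES = {"CBOW": 0, "Skip-gram": 1}

# Evaluation (output-layer) methods. Negative Sampling as the alternative is an
# assumption; the abstract names only Hierarchical Softmax.
EVALUATION_METHODS = {
    "Hierarchical Softmax": {"hs": 1, "negative": 0},
    "Negative Sampling": {"hs": 0, "negative": 5},
}

# Hypothetical candidate dimensions; only 100 appears in the abstract.
DIMENSIONS = [100, 200, 300]


def word2vec_grid():
    """Return one labelled keyword-argument dict per parameter combination."""
    configs = []
    for (arch, sg), (method, flags), dim in product(
        ARCHITECTURES.items(), EVALUATION_METHODS.items(), DIMENSIONS
    ):
        configs.append(
            {
                "label": f"{arch} / {method} / {dim}",
                "kwargs": {"sg": sg, "vector_size": dim, **flags},
            }
        )
    return configs


def best_reported_config():
    """Combination the abstract reports as best: CBOW, Hierarchical Softmax, 100."""
    return {"sg": 0, "hs": 1, "negative": 0, "vector_size": 100}


if __name__ == "__main__":
    for cfg in word2vec_grid():
        print(cfg["label"], cfg["kwargs"])
```

Under these assumptions, each `kwargs` dict could be passed to `gensim.models.Word2Vec(sentences, **kwargs)` to train one embedding per combination, and the downstream classifier's accuracy would then be compared across combinations.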
