Comparing Sentiment Analysis and Document Representation Methods of Amazon Reviews

Katic Tamara, Nemanja Milićević
DOI: 10.1109/SISY.2018.8524814 (https://doi.org/10.1109/SISY.2018.8524814)
Published in: 2018 IEEE 16th International Symposium on Intelligent Systems and Informatics (SISY), pp. 000283-000286
Publication date: 2018-09-01
Citations: 12

Abstract

Sentiment analysis has made considerable progress in the last few years. It has been used in several applications to identify opinions about people, products, brands, services, etc., which can, for example, improve a company's business. Some of these applications claim to use document representation models that are more effective than plain Information Retrieval approaches such as the bag-of-words representation, and interest in such models has grown as a way to overcome some of the limitations of bag-of-words. This paper compares several sentiment analysis and document representation methods on Amazon reviews. Traditional models (bag-of-words, bag-of-ngrams and their TF-IDF variants, combined with linear classifiers such as Logistic Regression and SVM) were used alongside deep learning models (word-based convolutional neural networks (ConvNets) and a simple long short-term memory (LSTM) recurrent neural network). Various document representation techniques were also tested, such as Paragraph Vector and pre-trained Word2Vec and GloVe word embeddings, where a vector is computed for each word in the document and the word vectors are aggregated using the element-wise mean. Deep learning models are shown to perform better than traditional models on our large dataset, with the LSTM achieving the best accuracy of 95.55%. Deep learning models generally work better than traditional models as training set size increases. Our best performing model can be used for automatic sentiment classification of future product reviews in retail stores.
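
The abstract mentions aggregating per-word embedding vectors into a single document vector with the element-wise mean. The following is a minimal sketch of that aggregation step only, not the authors' implementation. The embedding table `TOY_EMBEDDINGS`, the function name `document_vector`, the choice to skip out-of-vocabulary words, and the zero-vector fallback are all assumptions made for illustration; real Word2Vec or GloVe vectors would be loaded from pre-trained files and have far more dimensions.

```python
from typing import Dict, List, Sequence

# Hypothetical 3-dimensional "pre-trained" embeddings, invented for illustration.
# Real Word2Vec or GloVe vectors typically have hundreds of dimensions.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "great": [0.9, 0.1, 0.0],
    "product": [0.1, 0.5, 0.3],
    "terrible": [-0.8, 0.2, 0.1],
}


def document_vector(tokens: Sequence[str],
                    embeddings: Dict[str, List[float]],
                    dim: int) -> List[float]:
    """Average the vectors of all in-vocabulary tokens, one dimension at a time."""
    # Assumption: words missing from the embedding table are simply skipped.
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        # Assumption: a document with no known words maps to the zero vector.
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


review = "great product".split()
doc_vec = document_vector(review, TOY_EMBEDDINGS, dim=3)
print(doc_vec)
```

The resulting fixed-length vector could then be passed to any classifier, such as the linear models named in the abstract. Averaging discards word order, which is one reason sequence models like the LSTM may do better on the same input.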