Lexicon-Based Sentiment Analysis for Movie Review Tweets

A. Azizan, Nurul Najwa SK Abdul Jamal, M. N. Abdullah, Masurah Mohamad, N. Khairudin
Published in: 2019 1st International Conference on Artificial Intelligence and Data Sciences (AiDAS)
Publication date: 2019-09-01
DOI: 10.1109/AiDAS47888.2019.8970722 (https://doi.org/10.1109/AiDAS47888.2019.8970722)
Citations: 8

Abstract

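Sentiment analysis is a computational process for identifying and classifying subjective information in source material as positive, negative or neutral. It can extract feelings and emotions from a sentence, and it has been widely used to extract valuable information from people's views on social media. This project classifies movie reviews into positive, negative and neutral polarity using a lexicon-based method, with R as the language and development framework and Twitter data as the source material. First, tweets were extracted using RStudio and the Twitter API. The data were then pre-processed by removing stop words and noise. Next came tokenization, which separates each tweet into words and matches them against vocabularies of positive and negative words. Finally, each tweet is assigned a positive, negative or neutral polarity. The results were evaluated using standard metrics: precision, recall, F1 score and accuracy. The basic lexicon-based method achieved 52% accuracy. This accuracy is not impressive, but it is reasonable given the simplicity and minimal development cost of the approach for sentiment analysis of movie-related Twitter data.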
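The pipeline described in the abstract (pre-processing, tokenization, lexicon matching, then evaluation) can be illustrated with a minimal sketch. It is written in Python for illustration, not in the R used by the authors, and it is not their implementation. The word lists, stop-word list, sample tweets, and the function names `preprocess`, `classify` and `evaluate` are all hypothetical. The scoring rule (positive matches minus negative matches) is an assumption, since the abstract only says that words are matched against positive and negative vocabularies; reporting precision, recall and F1 per class is likewise an assumption.

```python
import re
from collections import Counter

# Tiny illustrative word lists (assumptions); a real run would load a full lexicon.
POSITIVE = {"good", "great", "love", "amazing", "fun"}
NEGATIVE = {"bad", "boring", "hate", "awful", "worst"}
STOP_WORDS = {"the", "a", "an", "is", "was", "it", "this", "i", "and", "of", "to"}


def preprocess(tweet: str) -> list[str]:
    """Lowercase, strip URLs, mentions and non-letters, then drop stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # links and @mentions
    text = re.sub(r"[^a-z\s]", " ", text)  # punctuation, digits, '#'
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def classify(tweet: str) -> str:
    """Score = positive matches minus negative matches; the sign gives the label."""
    tokens = preprocess(tweet)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def evaluate(gold: list[str], predicted: list[str]) -> dict:
    """Accuracy plus per-class precision, recall and F1."""
    pairs = Counter(zip(gold, predicted))  # (gold label, predicted label) -> count
    report = {"accuracy": sum(g == p for g, p in zip(gold, predicted)) / len(gold)}
    for label in ("positive", "negative", "neutral"):
        tp = pairs[(label, label)]
        predicted_as = sum(n for (g, p), n in pairs.items() if p == label)
        actually = sum(n for (g, p), n in pairs.items() if g == label)
        precision = tp / predicted_as if predicted_as else 0.0
        recall = tp / actually if actually else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Made-up tweets with hand-assigned labels, for demonstration only.
tweets = [
    "I love this movie, amazing cast! https://example.com/x",
    "@bob worst film of the year, so boring",
    "Watching the premiere tonight #movie",
    "Not good at all",
]
gold = ["positive", "negative", "neutral", "negative"]
predicted = [classify(t) for t in tweets]
print(predicted)
print(evaluate(gold, predicted))
```

In this toy run the last tweet is labelled positive because "good" matches the positive list and the negation "not" is ignored. This is one kind of error a plain word-matching method can make; the abstract does not say which errors account for the reported 52% accuracy.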