Aspect-Based Sentiment Analysis of Social Media Data With Pre-Trained Language Models

Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval. Pub Date: 2021-12-17. DOI: 10.1145/3508230.3508232

Anina Troya, Reshmi Gopalakrishna Pillai, Cristian Rodriguez Rivero, Zülküf Genç, S. Kayal, Dogu Araci


Citations: 3

Abstract




There is great scope for utilizing the growing volume of content that users post on social media platforms such as Twitter. This study explores the application of Aspect-Based Sentiment Analysis (ABSA) to tweets in order to retrieve fine-grained sentiment insights. The plant-based food domain is chosen as the area of focus. To the best of our knowledge, this is the first time the ABSA task has been carried out for this sector, which is distinct from standard food products because different and controversial aspects arise and opinions are polarized. The choice is relevant because these products can help in meeting the sustainable development goals and improve the welfare of millions of animals. Pre-trained BERT ("Bidirectional Encoder Representations from Transformers") is fine-tuned for this task; it stands out because it was trained to learn from all the words in a sentence simultaneously using transformers. The aim was to develop methods that can be applied to real-life cases, so lowering the dependency on labeled data and improving performance were the key objectives. This research contributes to existing approaches to ABSA by proposing data processing techniques that adapt social media data for ABSA. The project presents a new method for the aspect category detection (ACD) task that uses regular expressions (regex) and therefore does not rely on labeled data. For the aspect sentiment classification (ASC) task, a semi-supervised learning technique is explored. Additionally, Part-of-Speech (POS) tags are incorporated into the predictions. The findings show that regex is a solution for eliminating the dependency on labeled data for ACD. For ASC, fine-tuning BERT on a small subset of data was the most accurate method for lowering the dependency on aspect-level sentiment data.
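To make the regex-based ACD idea concrete, here is a minimal sketch of how aspect categories could be assigned to tweets without labeled data. The category names, the patterns, and the function `detect_aspect_categories` are all assumptions made for illustration; the abstract does not list the categories or expressions the authors actually used.

```python
import re

# Hypothetical aspect categories and patterns, for illustration only.
# The abstract does not specify the paper's real categories or expressions.
ASPECT_PATTERNS = {
    "taste": re.compile(r"\b(tast(e|es|y|ed)|flavou?rs?|delicious|bland)\b", re.IGNORECASE),
    "price": re.compile(r"\b(price[sd]?|cheap|expensive|cost(s|ly)?|afford(able)?)\b", re.IGNORECASE),
    "health": re.compile(r"\b(health(y|ier)?|protein|nutrition(al)?|processed)\b", re.IGNORECASE),
    "environment": re.compile(r"\b(climate|sustainab(le|ility)|emissions?|environment(al)?)\b", re.IGNORECASE),
}


def detect_aspect_categories(tweet: str) -> list[str]:
    """Return every aspect category whose pattern matches the tweet.

    A tweet can mention several aspects, so all matching categories are
    returned, in the order the categories are defined above.
    """
    return [name for name, pattern in ASPECT_PATTERNS.items() if pattern.search(tweet)]


if __name__ == "__main__":
    examples = [
        "This oat burger is delicious but way too expensive",
        "Plant-based meat is better for the climate",
        "Going to the gym later",
    ]
    for tweet in examples:
        print(detect_aspect_categories(tweet), "<-", tweet)
```

Because the patterns are written by hand, no aspect-level labels are needed; the trade-off is that coverage depends entirely on how well the expressions anticipate the vocabulary of the domain.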

