Towards Generalization of Machine Learning Models: A Case Study of Arabic Sentiment Analysis

Samir Abdaljalil, S. Hassanein, Hamdy Mubarak, Ahmed Abdelali

International Conference on Web and Social Media, published 2023-06-02
DOI: 10.1609/icwsm.v17i1.22204 (https://doi.org/10.1609/icwsm.v17i1.22204)

Abstract

The abundance of social media data in the Arab world, particularly on Twitter, has enabled companies and other entities to mine this rich data for important information, including people's sentiments and opinions towards a topic or product. However, with this abundance comes the challenge of producing models that deliver consistent results when tested in various contexts. Although model generalization has been thoroughly investigated in many fields, it has not been studied as extensively in the Arabic context. To address this gap, we investigate the generalization of models and data in Arabic, with application to sentiment analysis. We perform a battery of experiments and build different models, which we test on five independent test sets to understand how they perform on unseen data. In doing so, we detail techniques that improve the generalization of machine learning models in Arabic sentiment analysis, and we share a large, versatile dataset of approximately 1.64M Arabic tweets with their corresponding sentiment labels for use in future research. Our experiments show that the most consistent model is the one trained on a dataset labelled by a cascade of two models: one that labels neutral tweets, and another that identifies positive/negative tweets based on the Arabic emoji lexicon after class balancing. The BERT and SVM models trained on the refined data achieve average F-1 scores of 0.62 and 0.60, with standard deviations of 0.06 and 0.04, respectively, when evaluated on five diverse test sets, outperforming other models by at least 17% relative gain in F-1. Based on our experiments, we share recommendations to improve model generalization for classification tasks.
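The cascaded labelling and the consistency measure described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the four-entry `EMOJI_POLARITY` table stands in for the Arabic emoji lexicon, `toy_neutral_detector` stands in for the first-stage neutral model, class balancing is assumed to mean downsampling to the smallest class, and the standard deviation is assumed to be the population form. All function names are hypothetical.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical toy lexicon; the paper's Arabic emoji lexicon is not reproduced here.
EMOJI_POLARITY = {"😀": 1, "😍": 1, "😢": -1, "😡": -1}


def toy_neutral_detector(tweet):
    """Stand-in for the first-stage model (assumption): no known emoji means neutral."""
    return not any(ch in EMOJI_POLARITY for ch in tweet)


def emoji_polarity(tweet):
    """Second stage: sum lexicon polarities; return None when the signal is tied."""
    score = sum(EMOJI_POLARITY.get(ch, 0) for ch in tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def cascade_label(tweet, neutral_detector=toy_neutral_detector):
    """Run the neutral stage first, then the positive/negative stage."""
    if neutral_detector(tweet):
        return "neutral"
    return emoji_polarity(tweet)


def balance_by_downsampling(labelled, seed=0):
    """Assumed balancing strategy: sample every class down to the smallest class size."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for text, label in labelled:
        groups[label].append((text, label))
    smallest = min(len(items) for items in groups.values())
    balanced = []
    for items in groups.values():
        balanced.extend(rng.sample(items, smallest))
    return balanced


def summarize_f1(scores):
    """Mean and population standard deviation of F-1 across test sets."""
    return statistics.mean(scores), statistics.pstdev(scores)


def relative_gain(score, baseline):
    """Relative improvement of `score` over `baseline`, e.g. 0.17 for 17%."""
    return (score - baseline) / baseline
```

In this framing, a model generalizes well when `summarize_f1` returns a high mean together with a low standard deviation over the independent test sets, which is the sense in which the abstract calls a model "consistent".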