Authors: Mohamed Fawzy, M. Fakhr, M. A. Rizka
Venue: 2022 20th International Conference on Language Engineering (ESOLEC)
Published: 2022-10-12
DOI: 10.1109/ESOLEC54569.2022.10009633
Sentiment Analysis For Arabic Low Resource Data Using BERT-CNN
Users share opinions and hold discussions on the internet through social media platforms. A significant number of internet users now speak Arabic, and they tend to express their opinions in different dialects, so understanding people's opinions and emotions has become an urgent matter. Arabic sentiment analysis is challenging because of the language's linguistic complexity and multiple dialects, and because of limited data availability and quality. Research on low-resource sentiment analysis has therefore become necessary. This study proposes a Bidirectional Encoder Representations from Transformers (BERT) model that uses a Convolutional Neural Network (CNN) as a classification head for sentiment analysis of low-resource Arabic data. The classification head comprises a CNN layer, a dropout layer, and a ReLU activation function. The proposed approach was evaluated on three datasets collected from Twitter that contain different dialects. The last four BERT layers were fine-tuned while the other layers were frozen. The suggested model outperforms current state-of-the-art models in accuracy while using a 50% smaller batch size, fewer trained layers, and ∼20% fewer epochs.
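The architecture described in the abstract (an encoder whose last four layers are trainable, followed by a CNN classification head with dropout and ReLU) can be sketched roughly as follows. This is a minimal PyTorch illustration, not the authors' implementation: the encoder is a small randomly initialised `nn.TransformerEncoder` standing in for a pretrained Arabic BERT, and the names `CNNHead`, `BertCNNSketch`, and `freeze_all_but_last`, the filter count, kernel size, dropout rate, max-pooling step, final linear layer, and two-class output are all assumptions made for the example.

```python
import torch
from torch import nn


class CNNHead(nn.Module):
    """Hypothetical head: Conv1d -> ReLU -> max-pool over tokens -> dropout -> linear."""

    def __init__(self, hidden_size, num_filters=64, kernel_size=3,
                 num_classes=2, dropout=0.1):
        super().__init__()
        # All hyperparameters here are illustrative; the abstract does not give them.
        self.conv = nn.Conv1d(hidden_size, num_filters, kernel_size,
                              padding=kernel_size // 2)
        self.dropout = nn.Dropout(dropout)
        self.out = nn.Linear(num_filters, num_classes)

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden); Conv1d expects channels first.
        x = self.conv(hidden_states.transpose(1, 2))
        x = torch.relu(x)
        x = x.max(dim=2).values  # assumed pooling: max over token positions
        return self.out(self.dropout(x))


def freeze_all_but_last(layers, n_trainable=4):
    """Freeze every layer except the last `n_trainable` ones."""
    cutoff = len(layers) - n_trainable
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = i >= cutoff


class BertCNNSketch(nn.Module):
    """Stand-in encoder (NOT a pretrained BERT) followed by the CNN head."""

    def __init__(self, vocab_size=1000, hidden_size=32, num_layers=12,
                 num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        layer = nn.TransformerEncoderLayer(hidden_size, nhead=4,
                                           dim_feedforward=64,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = CNNHead(hidden_size, num_classes=num_classes)
        # Embeddings and the first layers stay frozen; the last four are trained.
        for p in self.embed.parameters():
            p.requires_grad = False
        freeze_all_but_last(self.encoder.layers, n_trainable=4)

    def forward(self, input_ids):
        return self.head(self.encoder(self.embed(input_ids)))


if __name__ == "__main__":
    model = BertCNNSketch()
    logits = model(torch.randint(0, 1000, (2, 16)))  # 2 tweets, 16 token ids each
    print(logits.shape)
```

With a real pretrained model, the same freezing helper would be applied to that model's list of encoder layers, and the head would receive its final hidden states instead of the stand-in encoder's output.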