Preprocessing Arabic Dialect for Sentiment Mining: State of Art

Zineb Nassr, N. Sael, F. Benabbou
Journal: ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, pp. 323-330
Published: 2020-11-23
DOI: https://doi.org/10.5194/ISPRS-ARCHIVES-XLIV-4-W3-2020-323-2020
Citations: 2

Abstract. Sentiment analysis is the analysis of ideas, emotions, evaluations, values, attitudes and feelings about products, services, companies, individuals, tasks, events, titles and their characteristics. With the growth of applications on the Internet and social networks, sentiment analysis has become increasingly important in text mining research and is used to explore users' opinions on the products and topics discussed online. Developments in Natural Language Processing and Computational Linguistics have contributed positively to sentiment analysis studies, especially for sentiments written in unstructured or semi-structured language. In this paper, we present a literature review of the pre-processing task in sentiment analysis, together with an analytical and comparative study of the research conducted on Arabic social networks. This study allowed us to conclude that several works have dealt with the generation of stop-word dictionaries. Two approaches are adopted: a manual one, which yields a limited list, and an automatic one, in which the list of stop words is extracted from social networks based on defined rules. For stemming, two algorithms have been proposed to isolate prefixes and suffixes from words in dialects. However, few works have addressed dialects directly, without translation. The Moroccan dialect in particular ranks fifth among the Arabic dialects studied, after the Jordanian, Egyptian, Tunisian and Algerian dialects. Despite the significant lack of studies on Arabic dialects, this comparative study allowed us to draw several conclusions about the difficulties and challenges encountered, as well as possible directions to explore in any pre-processing solution for dialect sentiment analysis.
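The abstract mentions rule-based automatic stop-word extraction and affix-stripping stemmers, but it does not state the rules or the affix lists. The Python sketch below is only an illustration of how such a pre-processing step might look. The document-frequency rule, the thresholds, the affix lists, the sample posts and all function names are assumptions made for this example, not the methods of the surveyed works.

```python
import re
from collections import Counter

# Arabic diacritics (U+064B..U+0652) and tatweel (U+0640) are dropped before tokenising.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")

# Hypothetical affix lists, chosen for illustration only.
PREFIXES = ("وال", "بال", "فال", "ال")
SUFFIXES = ("ات", "ين", "ون", "ها", "ة", "ي")

# Hypothetical sample posts in Moroccan dialect.
SAMPLE_POSTS = [
    "هاد الفيلم زوين بزاف",
    "هاد الماكلة خايبة بزاف",
    "هاد الخدمة زوينة",
]


def tokenize(text):
    """Strip diacritics and tatweel, then return runs of Arabic letters."""
    return ARABIC_WORD.findall(DIACRITICS.sub("", text))


def extract_stop_words(posts, min_doc_ratio=0.6, max_len=4):
    """Assumed rule: a stop word is a short token found in most posts."""
    doc_freq = Counter()
    for post in posts:
        doc_freq.update(set(tokenize(post)))  # count each token once per post
    total = len(posts)
    return {
        word
        for word, freq in doc_freq.items()
        if freq / total >= min_doc_ratio and len(word) <= max_len
    }


def light_stem(word, min_stem_len=3):
    """Remove at most one prefix and one suffix, longest match first,
    keeping at least min_stem_len letters."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= min_stem_len:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            word = word[:-len(suffix)]
            break
    return word


def preprocess(text, stop_words):
    """Tokenise, drop stop words, and stem the remaining tokens."""
    return [light_stem(tok) for tok in tokenize(text) if tok not in stop_words]
```

Calling `extract_stop_words(SAMPLE_POSTS)` and passing the result to `preprocess` gives a stop-word-free, stemmed token list for each post. The minimum stem length guards against over-stripping short dialect words, which is one of the risks a light stemmer of this kind runs on informal text.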