Resources building for sentiment analysis of content disseminated by Tunisian medias in social networks
Emna Fsih, Rahma Boujelbane, Lamia Hadrich Belguith
Language Resources and Evaluation, published 2023-12-02
DOI: https://doi.org/10.1007/s10579-023-09697-6
Citations: 1
Abstract
Social networks now play a fundamental role in promoting and disseminating television and radio programs to different categories of audiences. Political parties, influential groups and political activists have therefore rapidly seized on these new communication media to spread their ideas and express their sentiments on critical issues. In this context, Twitter, Facebook and YouTube have become very popular tools for sharing videos and communicating with users, who interact with each other to discuss problems, propose solutions and give viewpoints. This interaction on social media sites yields a huge amount of unstructured and noisy text, hence the need for automated analysis techniques to classify the sentiments conveyed in users' comments. In this work, we focus on opinions written in a less-resourced variety of Arabic, the Tunisian Dialect (TD). We present a process for building a sentiment analysis model for comments on Tunisian television broadcasts published on social media. These comments are written with varying spellings because TD does not have an orthographic standard. To this end, we design crucial resources, namely a sentiment lexicon and an annotated corpus, which we use to investigate machine-learning and deep-learning models in order to identify the best sentiment analysis model for the Tunisian Dialect.
About the journal:
Language Resources and Evaluation is the first publication devoted to the acquisition, creation, annotation, and use of language resources, together with methods for evaluation of resources, technologies, and applications.
Language resources include language data and descriptions in machine-readable form used to assist and augment language processing applications, such as written or spoken corpora and lexica, multimodal resources, grammars, terminology or domain-specific databases and dictionaries, ontologies, multimedia databases, etc., as well as basic software tools for their acquisition, preparation, annotation, management, customization, and use.
Evaluation of language resources concerns assessing the state of the art for a given technology, comparing different approaches to a given problem, assessing the availability of resources and technologies for a given application, benchmarking, and assessing system usability and user satisfaction.