Twitter sentiment analysis for Hausa abbreviations and acronyms

Habeeba Ibraheem Abdullahi, Muhammad Aminu Ahmad, Khalid Haruna
Science World Journal. Published 2024-05-02. DOI: 10.4314/swj.v19i1.13 (https://doi.org/10.4314/swj.v19i1.13)

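Abstract

Sentiment analysis, also called opinion mining, is the use of natural language processing to identify, extract, and organize sentiment from user-generated text in social networks, blogs, and product reviews. Hausa is one of the most widely spoken languages in Africa and one of the three major Nigerian languages, so research on it can have a significant influence on social, economic, business, political, and educational services and settings. Some Hausa texts on social media are written with abbreviations and acronyms. This poses a challenge to researchers because such comments are unstructured and need normalization before they can be understood, and sentiment analysis work on Hausa abbreviations and acronyms is scarce. An abbreviation is a shortened form of a word, while an acronym is an abbreviation formed from the initial letters of other words and pronounced as a word. This research aims to develop an improved Hausa sentiment dataset that enhances sentiment analysis of text containing abbreviations and acronyms. It does so by adapting the approach for Hausa sentiment analysis based on the Multinomial Naïve Bayes (MNB) and Logistic Regression algorithms, using a count vectorizer together with Python libraries for NLP. The improved dataset with abbreviations and acronyms outperforms the plain Hausa dataset by 4% in accuracy using Multinomial Naïve Bayes. The experimental results show that, in addition to the usual preprocessing of social media streams, understanding, interpreting, and resolving ambiguity in the use of abbreviations and acronyms improves the accuracy of the algorithms.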
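The pipeline the abstract describes (expand abbreviations, count words, classify with Multinomial Naïve Bayes) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abbreviation lexicon (`ngd`, `bbu`, `dd`), the six toy training sentences, and the class `TinyMultinomialNB` are all hypothetical and do not come from the paper's dataset. It also uses only the Python standard library as a stand-in for the count vectorizer and MNB classifier that the paper takes from Python NLP libraries.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical abbreviation lexicon, for illustration only (not from the paper).
ABBREVIATIONS = {"ngd": "nagode", "bbu": "babu", "dd": "dadi"}


def normalize(text):
    """Lowercase, split into word tokens, and expand known abbreviations."""
    tokens = re.findall(r"\w+", text.lower())
    return [ABBREVIATIONS.get(tok, tok) for tok in tokens]


class TinyMultinomialNB:
    """Bag-of-words Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.doc_counts = Counter(labels)         # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(normalize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        n_docs = sum(self.doc_counts.values())
        tokens = [t for t in normalize(doc) if t in self.vocab]  # skip unseen words
        scores = {}
        for label, n in self.doc_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(n / n_docs)  # log prior
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch.
train_docs = [
    "nagode sosai", "yana da kyau", "ina jin dadi",   # positive
    "babu kyau", "mugun abu", "babu dadi",            # negative
]
train_labels = ["positive"] * 3 + ["negative"] * 3

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(normalize("Ngd sosai!"))
print(model.predict("ngd"))
print(model.predict("bbu dd"))
```

Because abbreviations are expanded before counting, a token such as `ngd` contributes to the same count as `nagode` instead of being treated as an unseen word. That is the kind of normalization effect the abstract credits for the accuracy gain, although this toy data says nothing about its size.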