Arabic Pilgrim Services Dataset: Creating and Analysis

Hassanin M. Al-Barhamtoshy, Hanen Himdi, Mohamad Alyahya
Published in: 2023 1st International Conference on Advanced Innovations in Smart Cities (ICAISC)
Publication date: 2023-01-23
DOI: https://doi.org/10.1109/ICAISC56366.2023.10085561

Abstract

With countless Arabic news articles published daily, users have become increasingly concerned about obtaining news from credible sources. For individual readers, however, credible news sources tend to be associated with particular countries that they trust. Detecting the source of a news article is therefore essential to fake news detection and helps users place more trust in the news they consume. This paper describes the creation, filtering, analysis, and evaluation of a domain-specific Arabic dataset covering services for pilgrims. The Arabic Pilgrim Services (ArPiS) dataset is a collection of approximately 30,000 news articles gathered from three different Arabic countries and regions. The dataset is intended for measuring pilgrims' opinions of services and supports text mining, text classification, clustering, and text summarization. The basic search starts from 124 Arabic news websites. A series of filters is then applied to restrict the dataset to pilgrim-related services. Many topics are covered, and several rounds of filtering were carried out with a discussion group, which contributed many opinions and additional comments. The large volume of collected data requires additional effort and analysis to yield a valuable dataset; balancing the dataset is one such effort that we plan to undertake. The collected and annotated dataset represents real news about pilgrims' services, so an additional quantity of fake news data needs to be built. Accordingly, a precondition procedure is used as the methodology to create and then annotate such a dataset.
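The filtering and balancing steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the filter terms, the record layout, or the label names, so the keyword list, the `text` and `label` fields, and the `real`/`fake` labels below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical keyword list (Hajj, Umrah, pilgrims); the paper's actual
# filter terms are not given in the abstract.
PILGRIM_KEYWORDS = ("الحج", "العمرة", "الحجاج", "المعتمرين")


def is_pilgrim_related(text, keywords=PILGRIM_KEYWORDS):
    """Return True if the article text contains any of the keywords."""
    return any(kw in text for kw in keywords)


def filter_articles(articles):
    """Keep only articles whose text passes the keyword filter."""
    return [a for a in articles if is_pilgrim_related(a["text"])]


def fake_needed(articles):
    """Number of fake articles still needed to match the real ones."""
    counts = Counter(a["label"] for a in articles)
    return max(counts["real"] - counts["fake"], 0)


# Assumed record layout: a dict with "text" and "label" keys.
articles = [
    {"text": "خدمات نقل الحجاج في مكة", "label": "real"},
    {"text": "أسعار النفط ترتفع اليوم", "label": "real"},
    {"text": "تسهيلات جديدة لأداء العمرة", "label": "real"},
]

filtered = filter_articles(articles)
print(len(filtered), fake_needed(filtered))
```

Plain substring matching is the simplest possible filter. A real pipeline for Arabic text would likely normalize spelling variants and diacritics before matching, which this sketch leaves out.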