Bayesian Reasoning Based Malicious Data Discovery on Gulf-Dialectical Arabic Tweets
Dema Alorini, D. Rawat
2018 IEEE International Symposium on Technology and Society (ISTAS), 2018-11-01
DOI: 10.1109/ISTAS.2018.8638164 (https://doi.org/10.1109/ISTAS.2018.8638164)
One of the largest domains for written communication is the online domain. Today, social media is widely used by people of different ages, groups, and nationalities. In the Gulf region, Twitter is one of the most popular social networking sites. Tweets contain not only opinions, news, and conversations, but also malicious content such as false information, malicious links, and other types of cyber threats. Therefore, such tweets must first be identified in order to determine whether they are malicious. Tweets from the Gulf region are not written in Modern Standard Arabic (MSA), which most translation systems use as their Arabic source. In this paper, we first present a Gulf Dialectical Arabic (Gulf DA) to English dataset in order to create a Gulf Knowledge Base (GulfKB). Then, we use GulfKB model-based reasoning, built on Bayesian inference, to uncover malicious content and suspicious users. We evaluate the proposed approach numerically. Our approach achieves an accuracy of 91% and outperforms existing approaches in the state-of-the-art literature.
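As a rough illustration of how Bayesian inference over a knowledge base can score a tweet, the following minimal Python sketch applies Bayes' rule under a naive independence assumption. It is not the authors' GulfKB model: the names `TOY_KB` and `posterior_malicious`, the prior of 0.1, and every probability in the table are hypothetical values invented for this example, and the tokens stand in for English glosses of Gulf-dialect terms.

```python
from math import exp, log

# Hypothetical knowledge base (NOT the paper's GulfKB): for each English gloss
# of a dialect term, (P(term | malicious), P(term | benign)).
# All numbers are invented for illustration.
TOY_KB = {
    "free": (0.30, 0.05),
    "click": (0.40, 0.02),
    "link": (0.35, 0.10),
    "coffee": (0.01, 0.08),
}


def posterior_malicious(tokens, kb=TOY_KB, prior=0.1):
    """Return P(malicious | tokens), treating tokens as independent evidence."""
    # Accumulate in log space so long token lists do not underflow.
    log_mal = log(prior)
    log_ben = log(1.0 - prior)
    for tok in tokens:
        if tok not in kb:
            continue  # terms missing from the knowledge base add no evidence
        p_mal, p_ben = kb[tok]
        log_mal += log(p_mal)
        log_ben += log(p_ben)
    # Bayes' rule: P(M | x) = P(x | M) P(M) / (P(x | M) P(M) + P(x | B) P(B))
    shift = max(log_mal, log_ben)
    mal = exp(log_mal - shift)
    ben = exp(log_ben - shift)
    return mal / (mal + ben)


if __name__ == "__main__":
    print(posterior_malicious(["click", "free", "link"]))
    print(posterior_malicious(["coffee"]))
```

With no recognized tokens the function returns the prior unchanged, and each recognized term shifts the posterior toward whichever class makes that term more likely. A real system would estimate these likelihoods from labeled data rather than fix them by hand.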