Contributions to the study of bi-lingual Roman Urdu SMS spam filtering

2015 National Software Engineering Conference (NSEC). Pub Date: 2015-12-01. DOI: 10.1109/NSEC.2015.7396343

K. Mehmood, H. Afzal, A. Majeed, Hassan Latif


Citations: 3

Abstract



With the growing use of the internet and mobile phones, the volume of spam has also increased on both. Spam on these media is a growing threat and sometimes causes substantial financial loss as well as loss of data and confidentiality. Action therefore needs to be taken to stop spam on both media. This paper analyses various techniques currently used for spam filtering in the context of mobile text messages. The content of SMS messages is distinctive, so some techniques may be effective while others may not. Some of the most commonly used algorithms and techniques are discussed. Furthermore, we performed automatic spam filtering using machine learning algorithms on Roman Urdu text messages and achieved an accuracy of 92.2% on a manually curated corpus of 8449 messages. The SMS corpus has also been made available for future research.
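The abstract does not say which machine learning algorithms were used, so the following is only an illustrative sketch of one common choice for SMS spam filtering: a multinomial Naive Bayes classifier with Laplace smoothing. The class name, the tokenizer, and the four toy messages are all hypothetical and are not taken from the authors' corpus or method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of ASCII letters/digits; Roman Urdu is written
    # in Latin script, so this simple rule is assumed to be adequate here.
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    LABELS = ("spam", "ham")

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, messages, labels):
        for text, label in zip(messages, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # log P(label) + sum of log P(token | label) over known tokens
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # tokens never seen in training are skipped
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(self.LABELS, key=lambda label: self._log_score(tokens, label))


def accuracy(model, messages, labels):
    # Fraction of messages whose predicted label matches the given label
    correct = sum(model.predict(m) == y for m, y in zip(messages, labels))
    return correct / len(labels)


# Hypothetical toy messages, invented for illustration only.
train_messages = [
    "Mubarak ho aap ne inaam jeeta hai call karein",
    "Free offer abhi reply karein aur inaam jeetein",
    "Kal milte hain office mein",
    "Ammi ne kaha hai khana tayyar hai ghar aa jao",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesSpamFilter().fit(train_messages, train_labels)
print(model.predict("inaam jeeta hai free offer"))
```

A reported figure such as the 92.2% accuracy above would be measured on held-out messages rather than on the training set; the `accuracy` helper here only shows how that ratio is computed.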

