网页垃圾邮件:网页语言对垃圾邮件检测特征的影响研究

A. Alarifi, Mansour Alsaleh
{"title":"网页垃圾邮件:网页语言对垃圾邮件检测特征的影响研究","authors":"A. Alarifi, Mansour Alsaleh","doi":"10.1109/ICMLA.2012.229","DOIUrl":null,"url":null,"abstract":"Although search engines have deployed various techniques to detect and filter out Web spam, Web stammers continue to develop new tactics to influence the result of search engines ranking algorithms, for the purpose of obtaining an undeservedly high ranks. In this paper, we study the effect of the page language on the spam detection features. We examine how the distribution of a set of selected detection features changes according to the page language. Also, we study the effect of the page language on the detection rate of a given classifier using a selected set of detection features. The analysis results show that selecting suitable features for a classifier that segregates spam pages depends heavily on the language of the examined Web page, due in part to the different set of Web spam mechanisms used by each type of stammers.","PeriodicalId":157399,"journal":{"name":"2012 11th International Conference on Machine Learning and Applications","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"Web Spam: A Study of the Page Language Effect on the Spam Detection Features\",\"authors\":\"A. Alarifi, Mansour Alsaleh\",\"doi\":\"10.1109/ICMLA.2012.229\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Although search engines have deployed various techniques to detect and filter out Web spam, Web stammers continue to develop new tactics to influence the result of search engines ranking algorithms, for the purpose of obtaining an undeservedly high ranks. In this paper, we study the effect of the page language on the spam detection features. We examine how the distribution of a set of selected detection features changes according to the page language. Also, we study the effect of the page language on the detection rate of a given classifier using a selected set of detection features. The analysis results show that selecting suitable features for a classifier that segregates spam pages depends heavily on the language of the examined Web page, due in part to the different set of Web spam mechanisms used by each type of stammers.\",\"PeriodicalId\":157399,\"journal\":{\"name\":\"2012 11th International Conference on Machine Learning and Applications\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-12-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 11th International Conference on Machine Learning and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA.2012.229\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 11th International Conference on Machine Learning and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2012.229","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

摘要

尽管搜索引擎已经部署了各种技术来检测和过滤Web垃圾邮件,但Web结结巴巴者继续开发新的策略来影响搜索引擎排名算法的结果,以获得不应得的高排名。本文研究了页面语言对垃圾邮件检测特性的影响。我们研究了一组选定的检测特征的分布如何根据页面语言变化。此外,我们使用一组选定的检测特征研究了页面语言对给定分类器检测率的影响。分析结果表明,为分离垃圾邮件页面的分类器选择合适的特征在很大程度上取决于所检查的Web页面的语言,部分原因是每种类型的口吃者使用不同的Web垃圾邮件机制集。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Web Spam: A Study of the Page Language Effect on the Spam Detection Features
Although search engines have deployed various techniques to detect and filter out Web spam, Web stammers continue to develop new tactics to influence the result of search engines ranking algorithms, for the purpose of obtaining an undeservedly high ranks. In this paper, we study the effect of the page language on the spam detection features. We examine how the distribution of a set of selected detection features changes according to the page language. Also, we study the effect of the page language on the detection rate of a given classifier using a selected set of detection features. The analysis results show that selecting suitable features for a classifier that segregates spam pages depends heavily on the language of the examined Web page, due in part to the different set of Web spam mechanisms used by each type of stammers.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Excitation Current Forecasting for Reactive Power Compensation in Synchronous Motors: A Data Mining Approach Deep Structure Learning: Beyond Connectionist Approaches Using Twitter Content to Predict Psychopathy A Hybrid Approach to Coping with High Dimensionality and Class Imbalance for Software Defect Prediction O-linked Glycosylation Site Prediction Using Ensemble of Graphical Models
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1