利用百分比重叠特征选择提高假账户分类器的性能

Aris Tjahyanto, Rivanda Putra Pratama, A. M. Shiddiqi
{"title":"利用百分比重叠特征选择提高假账户分类器的性能","authors":"Aris Tjahyanto, Rivanda Putra Pratama, A. M. Shiddiqi","doi":"10.11591/ijai.v13.i2.pp1585-1595","DOIUrl":null,"url":null,"abstract":"Feature selection plays a crucial role in the development of high-performance classification models. We propose an innovative method for detecting fake accounts. This method leverages the percentage overlap technique to refine feature selection. We introduce our technique upon earlier work that showcased the enhanced efficacy of the Naïve Bayesian classifier through dataset normalization. Our study employs a dataset of account profiles sourced from Twitter, which we normalize using the Min-Max method. We analyze the results through a series of comprehensive experiments involving diverse classification algorithms—such as Naïve Bayes, decision tree, k-nearest neighbors (KNN), deep learning, and support vector machines (SVM). Our experimental results demonstrate a 100% accuracy achieved by the SVM and deep learning classifiers. The results are attributed to the percentage overlap technique, which facilitates the identification of four highly informative features. These findings outperform models with more extensive feature sets, underscoring the efficacy of our approach.","PeriodicalId":507934,"journal":{"name":"IAES International Journal of Artificial Intelligence (IJ-AI)","volume":"10 7","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improved performance of fake account classifiers with percentage overlap features selection\",\"authors\":\"Aris Tjahyanto, Rivanda Putra Pratama, A. M. Shiddiqi\",\"doi\":\"10.11591/ijai.v13.i2.pp1585-1595\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Feature selection plays a crucial role in the development of high-performance classification models. We propose an innovative method for detecting fake accounts. This method leverages the percentage overlap technique to refine feature selection. We introduce our technique upon earlier work that showcased the enhanced efficacy of the Naïve Bayesian classifier through dataset normalization. Our study employs a dataset of account profiles sourced from Twitter, which we normalize using the Min-Max method. We analyze the results through a series of comprehensive experiments involving diverse classification algorithms—such as Naïve Bayes, decision tree, k-nearest neighbors (KNN), deep learning, and support vector machines (SVM). Our experimental results demonstrate a 100% accuracy achieved by the SVM and deep learning classifiers. The results are attributed to the percentage overlap technique, which facilitates the identification of four highly informative features. These findings outperform models with more extensive feature sets, underscoring the efficacy of our approach.\",\"PeriodicalId\":507934,\"journal\":{\"name\":\"IAES International Journal of Artificial Intelligence (IJ-AI)\",\"volume\":\"10 7\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IAES International Journal of Artificial Intelligence (IJ-AI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.11591/ijai.v13.i2.pp1585-1595\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IAES International Journal of Artificial Intelligence (IJ-AI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.11591/ijai.v13.i2.pp1585-1595","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

在开发高性能分类模型的过程中,特征选择起着至关重要的作用。我们提出了一种检测假账户的创新方法。该方法利用百分比重叠技术来完善特征选择。我们的技术是在早期工作的基础上提出的,早期工作展示了通过数据集规范化提高奈夫贝叶斯分类器的功效。我们的研究采用了来自 Twitter 的账户配置文件数据集,并使用 Min-Max 方法对其进行归一化处理。我们通过一系列综合实验对结果进行了分析,这些实验涉及多种分类算法,如奈夫贝叶斯、决策树、k-近邻(KNN)、深度学习和支持向量机(SVM)。实验结果表明,SVM 和深度学习分类器的准确率达到了 100%。这些结果归功于百分比重叠技术,该技术有助于识别四个高信息量特征。这些结果优于具有更广泛特征集的模型,凸显了我们方法的功效。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Improved performance of fake account classifiers with percentage overlap features selection
Feature selection plays a crucial role in the development of high-performance classification models. We propose an innovative method for detecting fake accounts. This method leverages the percentage overlap technique to refine feature selection. We introduce our technique upon earlier work that showcased the enhanced efficacy of the Naïve Bayesian classifier through dataset normalization. Our study employs a dataset of account profiles sourced from Twitter, which we normalize using the Min-Max method. We analyze the results through a series of comprehensive experiments involving diverse classification algorithms—such as Naïve Bayes, decision tree, k-nearest neighbors (KNN), deep learning, and support vector machines (SVM). Our experimental results demonstrate a 100% accuracy achieved by the SVM and deep learning classifiers. The results are attributed to the percentage overlap technique, which facilitates the identification of four highly informative features. These findings outperform models with more extensive feature sets, underscoring the efficacy of our approach.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
FinTech forecasting using an evolving connectionist system for lenders and borrowers: ecosystem behavior Dealing imbalance dataset problem in sentiment analysis of recession in Indonesia A survey on planet leaf disease identification and classification by various machine-learning technique Effect of dataset distribution on automatic road extraction in very high-resolution orthophoto using DeepLab V3+ Feature selection techniques for microarray dataset: a review
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1