Optimization of Hyperparameter K in K-Nearest Neighbor Using Particle Swarm Optimization

Muhammad Rizki, Arief Hermawan, Donny Avianto
JUITA : Jurnal Informatika, published 2024-05-20. DOI: https://doi.org/10.30595/juita.v12i1.20688

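Abstract

This study aims to improve the performance of the K-Nearest Neighbors (KNN) algorithm by optimizing its hyperparameter K with the Particle Swarm Optimization (PSO) algorithm. Whereas prior research typically focuses on a single dataset, this study seeks to show that PSO can optimize the KNN hyperparameter effectively across diverse datasets. Three datasets from different domains are used: Iris, Wine, and Breast Cancer, each with distinct classification types and classes. The study also seeks to establish that PSO performs well with both the Manhattan and Euclidean distance metrics. Before optimization, experiments with default K values (3, 5, and 7) were run to observe KNN behavior on each dataset. Initial results show stable accuracy on the Iris dataset, while the Wine and Breast Cancer datasets show a drop in accuracy at K=3, attributed to the complexity of their attributes. Optimizing K with PSO yields a significant increase in accuracy, particularly on the Wine dataset, where accuracy improves by 6.28% with the Manhattan metric. The higher accuracy of the optimized KNN algorithm demonstrates the effectiveness of PSO in overcoming KNN's limitations. Although the gain on the Iris dataset is less pronounced, the study shows that optimizing K can yield positive results even for datasets on which performance is already good. For future research, the authors recommend similar experiments with other algorithms, such as Support Vector Machine or Random Forest, to further evaluate PSO's optimization ability on the Iris, Wine, and Breast Cancer datasets.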
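The procedure the abstract describes (score KNN at the default K values, then let PSO search for a better K under each distance metric) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the data is a synthetic two-class toy set instead of Iris, Wine, or Breast Cancer, the fitness is accuracy on a held-out split, the PSO parameters (`w`, `c1`, `c2`, swarm size, iteration count) are common textbook defaults, each particle's continuous position is rounded to the nearest integer K, and all function names are hypothetical.

```python
import math
import random


def distance(a, b, metric="euclidean"):
    """Manhattan (L1) or Euclidean (L2) distance between two feature vectors."""
    if metric == "manhattan":
        return sum(abs(x - y) for x, y in zip(a, b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_accuracy(train, valid, k, metric):
    """Fraction of validation points whose k-nearest-neighbour vote is correct."""
    hits = 0
    for query, label in valid:
        nearest = sorted(train, key=lambda row: distance(row[0], query, metric))[:k]
        votes = {}
        for _, neighbour_label in nearest:
            votes[neighbour_label] = votes.get(neighbour_label, 0) + 1
        # Ties go to the smallest label so that results are deterministic.
        predicted = max(sorted(votes), key=votes.get)
        hits += predicted == label
    return hits / len(valid)


def pso_best_k(fitness, k_min, k_max, n_particles=8, n_iters=15,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximise fitness(k) over integers in [k_min, k_max] with a basic global-best PSO."""
    rng = random.Random(seed)
    cache = {}

    def evaluate(position):
        # Assumed encoding: round the continuous position to an integer K.
        k = min(k_max, max(k_min, round(position)))
        if k not in cache:
            cache[k] = fitness(k)
        return k, cache[k]

    pos = [rng.uniform(k_min, k_max) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = [evaluate(p) for p in pos]  # (k, fitness) of each particle's best
    pbest_pos = pos[:]
    g = max(range(n_particles), key=lambda i: pbest[i][1])
    gbest, gbest_pos = pbest[g], pbest_pos[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Inertia + pull toward personal best + pull toward global best.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest_pos[i] - pos[i])
                      + c2 * r2 * (gbest_pos - pos[i]))
            pos[i] = min(k_max, max(k_min, pos[i] + vel[i]))
            candidate = evaluate(pos[i])
            if candidate[1] > pbest[i][1]:
                pbest[i], pbest_pos[i] = candidate, pos[i]
            if candidate[1] > gbest[1]:
                gbest, gbest_pos = candidate, pos[i]
    return gbest  # (best_k, best_accuracy)


def make_toy_data(n_per_class=60, seed=1):
    """Two overlapping 2-D Gaussian classes; a stand-in for a real dataset."""
    rng = random.Random(seed)
    rows = [((rng.gauss(c, 1.0), rng.gauss(c, 1.0)), label)
            for label, c in ((0, 0.0), (1, 2.0)) for _ in range(n_per_class)]
    rng.shuffle(rows)
    split = int(0.7 * len(rows))
    return rows[:split], rows[split:]


train, valid = make_toy_data()
for metric in ("manhattan", "euclidean"):
    baseline = {k: knn_accuracy(train, valid, k, metric) for k in (3, 5, 7)}
    best_k, best_acc = pso_best_k(lambda k: knn_accuracy(train, valid, k, metric), 1, 25)
    print(metric, "baseline:", baseline, "PSO:", best_k, round(best_acc, 3))
```

Because K is a single bounded integer, an exhaustive scan over the range would also be feasible; the sketch only shows how a swarm can be applied to this search. Scoring on one held-out split is a simplification, and cross-validated accuracy would give a less noisy fitness.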