Comparative Analysis of the Performance of Various Support Vector Machine Kernels

2022 5th Information Technology for Education and Development (ITED). Pub Date: 2022-11-01. DOI: 10.1109/ITED56637.2022.10051564

A. Kuyoro, Sheriff Alimi, O. Awodele


Citations: 1

Abstract


When applied to a classification problem, a Support Vector Machine (SVM) separates classes using decision boundaries, with the primary objective of establishing a large margin between the support vectors of the respective classes; it uses kernels to achieve non-linear decision boundaries. This work examines the performance of four SVM kernel functions (Sigmoid, Linear, Radial Basis Function (RBF), and Polynomial) on classification problems using two datasets from two domains: the Knowledge Discovery in Dataset (KDD) and a set of features extracted from voiced and unvoiced frames. The Polynomial kernel had the best classification performance on the KDD dataset, with accuracy and precision of 99.77% and 99.8% respectively, but recorded the worst performance on the voice-feature dataset, with an accuracy of 74.96%. By induction, the Polynomial kernel can be the best choice for some classification datasets yet return the worst classification performance on others. The RBF kernel shows consistently high performance across the two data domains, with accuracies of 96.04% and 99.77%, and can be considered a general-purpose kernel guaranteed to yield satisfactory classification performance regardless of dataset type or data domain. The performance of the Polynomial kernel over the two datasets supports the “No Free Lunch Theorem”, which, applied to machine learning, means that if an algorithm performs well on one class of problems, it may perform worse on other classes. This implies that no single machine learning algorithm may give the best possible performance across a set of problems; it is therefore important for researchers to try out various algorithms before concluding on the best possible result for any dataset.
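For readers unfamiliar with the four kernels named in the abstract, the following is a minimal sketch of their standard textbook definitions in plain Python. The abstract does not report the hyperparameters used, so the parameter names and default values here (`gamma`, `coef0`, `degree`) are illustrative assumptions, not the authors' settings.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def dot(x: Vector, y: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Vector, y: Vector) -> float:
    """k(x, y) = <x, y>; yields a linear decision boundary."""
    return dot(x, y)


def polynomial_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                      coef0: float = 1.0, degree: int = 3) -> float:
    """k(x, y) = (gamma * <x, y> + coef0) ** degree.

    gamma, coef0 and degree are assumed defaults for illustration only.
    """
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """k(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed default."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x: Vector, y: Vector, gamma: float = 1.0,
                   coef0: float = 0.0) -> float:
    """k(x, y) = tanh(gamma * <x, y> + coef0); parameters are assumed defaults."""
    return math.tanh(gamma * dot(x, y) + coef0)


# Lookup table so the same pair of points can be scored under each kernel.
KERNELS: Dict[str, Callable[[Vector, Vector], float]] = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
    "sigmoid": sigmoid_kernel,
}

x, y = [1.0, 2.0], [3.0, 0.5]
similarities = {name: kernel(x, y) for name, kernel in KERNELS.items()}
```

Each function maps a pair of feature vectors to a similarity score; an SVM trained with a given kernel uses these scores in place of plain inner products, which is how the non-linear kernels produce non-linear decision boundaries. Comparing kernels on a real dataset, as the paper does, would additionally require training a classifier per kernel and measuring accuracy and precision on held-out data.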
