基于距离分类数据的核学习方法

Lifei Chen, G. Guo, Shengrui Wang, Xiangzeng Kong
{"title":"基于距离分类数据的核学习方法","authors":"Lifei Chen, G. Guo, Shengrui Wang, Xiangzeng Kong","doi":"10.1109/UKCI.2014.6930159","DOIUrl":null,"url":null,"abstract":"Kernel-based methods have become popular in machine learning; however, they are typically designed for numeric data. These methods are established in vector spaces, which are undefined for categorical data. In this paper, we propose a new kind of kernel trick, showing that mapping of categorical samples into kernel spaces can be alternatively described as assigning a kernel-based weight to each categorical attribute of the input space, so that common distance measures can be employed. A data-driven approach is then proposed to kernel bandwidth selection by optimizing feature weights. We also make use of the kernel-based distance measure to effectively extend nearest-neighbor classification to classify categorical data. Experimental results on real-world data sets show the outstanding performance of this approach compared to that obtained in the original input space.","PeriodicalId":315044,"journal":{"name":"2014 14th UK Workshop on Computational Intelligence (UKCI)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Kernel learning method for distance-based classification of categorical data\",\"authors\":\"Lifei Chen, G. Guo, Shengrui Wang, Xiangzeng Kong\",\"doi\":\"10.1109/UKCI.2014.6930159\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Kernel-based methods have become popular in machine learning; however, they are typically designed for numeric data. These methods are established in vector spaces, which are undefined for categorical data. In this paper, we propose a new kind of kernel trick, showing that mapping of categorical samples into kernel spaces can be alternatively described as assigning a kernel-based weight to each categorical attribute of the input space, so that common distance measures can be employed. A data-driven approach is then proposed to kernel bandwidth selection by optimizing feature weights. We also make use of the kernel-based distance measure to effectively extend nearest-neighbor classification to classify categorical data. Experimental results on real-world data sets show the outstanding performance of this approach compared to that obtained in the original input space.\",\"PeriodicalId\":315044,\"journal\":{\"name\":\"2014 14th UK Workshop on Computational Intelligence (UKCI)\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-10-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 14th UK Workshop on Computational Intelligence (UKCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/UKCI.2014.6930159\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 14th UK Workshop on Computational Intelligence (UKCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/UKCI.2014.6930159","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

摘要

基于核的方法在机器学习中很流行;然而,它们通常是为数值数据设计的。这些方法是在向量空间中建立的,这对于分类数据是未定义的。在本文中,我们提出了一种新的核技巧,表明将分类样本映射到核空间可以被描述为为输入空间的每个分类属性分配一个基于核的权重,从而可以使用公共距离度量。然后提出了一种数据驱动的方法,通过优化特征权重来选择内核带宽。我们还利用基于核的距离度量来有效地扩展最近邻分类来对分类数据进行分类。在真实数据集上的实验结果表明,与在原始输入空间中获得的结果相比,该方法具有出色的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Kernel learning method for distance-based classification of categorical data
Kernel-based methods have become popular in machine learning; however, they are typically designed for numeric data. These methods are established in vector spaces, which are undefined for categorical data. In this paper, we propose a new kind of kernel trick, showing that mapping of categorical samples into kernel spaces can be alternatively described as assigning a kernel-based weight to each categorical attribute of the input space, so that common distance measures can be employed. A data-driven approach is then proposed to kernel bandwidth selection by optimizing feature weights. We also make use of the kernel-based distance measure to effectively extend nearest-neighbor classification to classify categorical data. Experimental results on real-world data sets show the outstanding performance of this approach compared to that obtained in the original input space.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
PermGA algorithm for a sequential optimal space filling DoE framework Modeling neural plasticity in echo state networks for time series prediction Hybridisation of decomposition and GRASP for combinatorial multiobjective optimisation Adaptive mutation in dynamic environments Automatic image annotation with long distance spatial-context
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1