{"title":"训练核树的方法","authors":"D. A. Devyatkin, O. G. Grigoriev","doi":"10.3103/s0147688223050040","DOIUrl":null,"url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>Axis-parallel decision trees perform poorly on that multidimensional sparse data that are frequently input in many tasks. A straightforward solution is to create decision trees that have oblique splits; however, most training approaches have low performance. These models can easily overfit, so they should be combined with a random ensemble. This paper proposes an algorithm to train kernel decision trees. At each stump, the algorithm optimizes a loss function with a margin rescaling approach that simultaneously optimizes the margin and impurity criteria. We performed an experimental evaluation of several tasks, such as studying the reaction of social media users and image recognition. The experimental results show that the proposed algorithm trains ensembles that outperform other oblique or kernel forests in many datasets.</p>","PeriodicalId":43962,"journal":{"name":"Scientific and Technical Information Processing","volume":"41 1","pages":""},"PeriodicalIF":0.4000,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Method of Training a Kernel Tree\",\"authors\":\"D. A. Devyatkin, O. G. Grigoriev\",\"doi\":\"10.3103/s0147688223050040\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<h3 data-test=\\\"abstract-sub-heading\\\">Abstract</h3><p>Axis-parallel decision trees perform poorly on that multidimensional sparse data that are frequently input in many tasks. A straightforward solution is to create decision trees that have oblique splits; however, most training approaches have low performance. These models can easily overfit, so they should be combined with a random ensemble. This paper proposes an algorithm to train kernel decision trees. At each stump, the algorithm optimizes a loss function with a margin rescaling approach that simultaneously optimizes the margin and impurity criteria. We performed an experimental evaluation of several tasks, such as studying the reaction of social media users and image recognition. 
The experimental results show that the proposed algorithm trains ensembles that outperform other oblique or kernel forests in many datasets.</p>\",\"PeriodicalId\":43962,\"journal\":{\"name\":\"Scientific and Technical Information Processing\",\"volume\":\"41 1\",\"pages\":\"\"},\"PeriodicalIF\":0.4000,\"publicationDate\":\"2024-03-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific and Technical Information Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3103/s0147688223050040\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"INFORMATION SCIENCE & LIBRARY SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific and Technical Information Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3103/s0147688223050040","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}
Axis-parallel decision trees perform poorly on the multidimensional sparse data that are frequent inputs in many tasks. A straightforward solution is to build decision trees with oblique splits; however, most approaches to training such trees perform poorly. These models also overfit easily, so they should be combined into a random ensemble. This paper proposes an algorithm for training kernel decision trees. At each stump, the algorithm optimizes a loss function via margin rescaling, jointly optimizing the margin and an impurity criterion. We performed an experimental evaluation on several tasks, such as analyzing the reactions of social media users and image recognition. The experimental results show that the proposed algorithm trains ensembles that outperform other oblique or kernel forests on many datasets.
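The abstract only sketches the training procedure, so the snippet below is a minimal illustrative sketch of the general kernel-tree idea rather than the authors' algorithm: each internal node splits the samples with a kernelized (RBF) classifier instead of an axis-parallel threshold, while the paper's margin-rescaling loss that jointly optimizes margin and impurity is replaced by an off-the-shelf SVM split plus a Gini-impurity stopping check. All names here (KernelTreeNode, gini, max_depth, min_samples) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC


def gini(y):
    """Gini impurity of an integer label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


class KernelTreeNode:
    """Binary tree whose internal nodes split samples with a kernelized classifier."""

    def __init__(self, depth=0, max_depth=3, min_samples=10):
        self.depth = depth
        self.max_depth = max_depth
        self.min_samples = min_samples
        self.split = None      # RBF-kernel SVM acting as the node's (kernel) split
        self.children = None   # (left, right) child nodes
        self.label = None      # majority class, returned when the node is a leaf

    def fit(self, X, y):
        self.label = np.bincount(y).argmax()
        if self.depth >= self.max_depth or len(y) < self.min_samples or gini(y) == 0.0:
            return self  # stop: node is pure or a size/depth limit was reached
        # Crude surrogate for a joint margin/impurity objective: bucket the classes
        # into two groups and fit a max-margin kernel split between the groups.
        targets = (y >= np.median(y)).astype(int)
        if targets.min() == targets.max():
            return self
        self.split = SVC(kernel="rbf", C=1.0).fit(X, targets)
        mask = self.split.predict(X).astype(bool)
        if mask.all() or not mask.any():
            self.split = None  # degenerate split: keep this node as a leaf
            return self
        self.children = (
            KernelTreeNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[~mask], y[~mask]),
            KernelTreeNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[mask], y[mask]),
        )
        return self

    def predict(self, X):
        if self.split is None or self.children is None:
            return np.full(len(X), self.label)
        mask = self.split.predict(X).astype(bool)
        out = np.empty(len(X), dtype=int)
        if (~mask).any():
            out[~mask] = self.children[0].predict(X[~mask])
        if mask.any():
            out[mask] = self.children[1].predict(X[mask])
        return out


if __name__ == "__main__":
    # Tiny sanity check on synthetic data; not a reproduction of the paper's experiments.
    X, y = make_classification(n_samples=300, n_features=20, random_state=0)
    tree = KernelTreeNode(max_depth=3).fit(X, y)
    print("train accuracy:", (tree.predict(X) == y).mean())
```

In practice, such per-node kernel splits are expensive and prone to overfitting on their own, which is consistent with the abstract's point that kernel or oblique trees are best used inside a random ensemble.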
Journal description:
Scientific and Technical Information Processing is a refereed journal that covers all aspects of management and use of information technology in libraries and archives, information centres, and the information industry in general. Emphasis is on practical applications of new technologies and techniques for information analysis and processing.