{"title":"训练核树的方法","authors":"D. A. Devyatkin, O. G. Grigoriev","doi":"10.3103/s0147688223050040","DOIUrl":null,"url":null,"abstract":"<h3 data-test=\"abstract-sub-heading\">Abstract</h3><p>Axis-parallel decision trees perform poorly on that multidimensional sparse data that are frequently input in many tasks. A straightforward solution is to create decision trees that have oblique splits; however, most training approaches have low performance. These models can easily overfit, so they should be combined with a random ensemble. This paper proposes an algorithm to train kernel decision trees. At each stump, the algorithm optimizes a loss function with a margin rescaling approach that simultaneously optimizes the margin and impurity criteria. We performed an experimental evaluation of several tasks, such as studying the reaction of social media users and image recognition. The experimental results show that the proposed algorithm trains ensembles that outperform other oblique or kernel forests in many datasets.</p>","PeriodicalId":43962,"journal":{"name":"Scientific and Technical Information Processing","volume":"41 1","pages":""},"PeriodicalIF":0.4000,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Method of Training a Kernel Tree\",\"authors\":\"D. A. Devyatkin, O. G. Grigoriev\",\"doi\":\"10.3103/s0147688223050040\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<h3 data-test=\\\"abstract-sub-heading\\\">Abstract</h3><p>Axis-parallel decision trees perform poorly on that multidimensional sparse data that are frequently input in many tasks. A straightforward solution is to create decision trees that have oblique splits; however, most training approaches have low performance. These models can easily overfit, so they should be combined with a random ensemble. This paper proposes an algorithm to train kernel decision trees. At each stump, the algorithm optimizes a loss function with a margin rescaling approach that simultaneously optimizes the margin and impurity criteria. We performed an experimental evaluation of several tasks, such as studying the reaction of social media users and image recognition. 
The experimental results show that the proposed algorithm trains ensembles that outperform other oblique or kernel forests in many datasets.</p>\",\"PeriodicalId\":43962,\"journal\":{\"name\":\"Scientific and Technical Information Processing\",\"volume\":\"41 1\",\"pages\":\"\"},\"PeriodicalIF\":0.4000,\"publicationDate\":\"2024-03-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific and Technical Information Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3103/s0147688223050040\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"INFORMATION SCIENCE & LIBRARY SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific and Technical Information Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3103/s0147688223050040","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}
Axis-parallel decision trees perform poorly on the multidimensional sparse data that are frequent inputs in many tasks. A straightforward solution is to build decision trees with oblique splits; however, most approaches to training such trees perform poorly. These models also overfit easily, so they should be combined into a random ensemble. This paper proposes an algorithm for training kernel decision trees. At each stump, the algorithm optimizes a loss function via margin rescaling, jointly optimizing the margin and an impurity criterion. We performed an experimental evaluation on several tasks, such as analyzing the reactions of social media users and image recognition. The experimental results show that the proposed algorithm trains ensembles that outperform other oblique or kernel forests on many datasets.
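The abstract only sketches the training procedure, so the snippet below is a minimal illustrative sketch of the general kernel-tree idea rather than the authors' algorithm: each internal node splits the samples with a kernelized (RBF) classifier instead of an axis-parallel threshold, while the paper's margin-rescaling loss that jointly optimizes margin and impurity is replaced by an off-the-shelf SVM split plus a Gini-impurity stopping check. All names here (KernelTreeNode, gini, max_depth, min_samples) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC


def gini(y):
    """Gini impurity of an integer label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


class KernelTreeNode:
    """Binary tree whose internal nodes split samples with a kernelized classifier."""

    def __init__(self, depth=0, max_depth=3, min_samples=10):
        self.depth = depth
        self.max_depth = max_depth
        self.min_samples = min_samples
        self.split = None      # RBF-kernel SVM acting as the node's (kernel) split
        self.children = None   # (left, right) child nodes
        self.label = None      # majority class, returned when the node is a leaf

    def fit(self, X, y):
        self.label = np.bincount(y).argmax()
        if self.depth >= self.max_depth or len(y) < self.min_samples or gini(y) == 0.0:
            return self  # stop: node is pure or a size/depth limit was reached
        # Crude surrogate for a joint margin/impurity objective: bucket the classes
        # into two groups and fit a max-margin kernel split between the groups.
        targets = (y >= np.median(y)).astype(int)
        if targets.min() == targets.max():
            return self
        self.split = SVC(kernel="rbf", C=1.0).fit(X, targets)
        mask = self.split.predict(X).astype(bool)
        if mask.all() or not mask.any():
            self.split = None  # degenerate split: keep this node as a leaf
            return self
        self.children = (
            KernelTreeNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[~mask], y[~mask]),
            KernelTreeNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[mask], y[mask]),
        )
        return self

    def predict(self, X):
        if self.split is None or self.children is None:
            return np.full(len(X), self.label)
        mask = self.split.predict(X).astype(bool)
        out = np.empty(len(X), dtype=int)
        if (~mask).any():
            out[~mask] = self.children[0].predict(X[~mask])
        if mask.any():
            out[mask] = self.children[1].predict(X[mask])
        return out


if __name__ == "__main__":
    # Tiny sanity check on synthetic data; not a reproduction of the paper's experiments.
    X, y = make_classification(n_samples=300, n_features=20, random_state=0)
    tree = KernelTreeNode(max_depth=3).fit(X, y)
    print("train accuracy:", (tree.predict(X) == y).mean())
```

In practice, such per-node kernel splits are expensive and prone to overfitting on their own, which is consistent with the abstract's point that kernel or oblique trees are best used inside a random ensemble.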
Journal description:
Scientific and Technical Information Processing is a refereed journal that covers all aspects of management and use of information technology in libraries and archives, information centres, and the information industry in general. Emphasis is on practical applications of new technologies and techniques for information analysis and processing.