{"title":"A game theoretic decision forest for feature selection and classification","authors":"Mihai-Alexandru Suciu, Rodica Ioana Lung","doi":"10.1093/jigpal/jzae049","DOIUrl":null,"url":null,"abstract":"\n Classification and feature selection are two of the most intertwined problems in machine learning. Decision trees (DTs) are straightforward models that address these problems offering also the advantage of explainability. However, solutions that are based on them are either tailored for the problem they solve or their performance is dependent on the split criterion used. A game-theoretic decision forest model is proposed to approach both issues. DTs in the forest use a splitting mechanism based on the Nash equilibrium concept. A feature importance measure is computed after each tree is built. The selection of features for the next trees is based on the information provided by this measure. To make predictions, training data is aggregated from all leaves that contain the data tested, and logistic regression is further used. Numerical experiments illustrate the efficiency of the approach. A real data example that studies country income groups and world development indicators using the proposed approach is presented.","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1093/jigpal/jzae049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Classification and feature selection are two of the most intertwined problems in machine learning. Decision trees (DTs) are straightforward models that address these problems offering also the advantage of explainability. However, solutions that are based on them are either tailored for the problem they solve or their performance is dependent on the split criterion used. A game-theoretic decision forest model is proposed to approach both issues. DTs in the forest use a splitting mechanism based on the Nash equilibrium concept. A feature importance measure is computed after each tree is built. The selection of features for the next trees is based on the information provided by this measure. To make predictions, training data is aggregated from all leaves that contain the data tested, and logistic regression is further used. Numerical experiments illustrate the efficiency of the approach. A real data example that studies country income groups and world development indicators using the proposed approach is presented.