Random forests in the zero to k inflated Power series populations
Authors: H. Saboori, Mahdi Doostparast
Journal: Statistics, Optimization & Information Computing
Published: 2023-08-03
DOI: 10.19139/soic-2310-5070-1773 (https://doi.org/10.19139/soic-2310-5070-1773)
Abstract
Tree-based algorithms are a class of useful, versatile, and popular tools in data mining and machine learning. Indeed, tree aggregation methods, such as random forests, are among the most powerful approaches to boosting predictive performance. In this article, we apply tree-based methods to model and predict discrete data, using a highly flexible model. Discrete data may be inflated at some points: inflation can occur at zero, at one, or at another value, and it may even occur at two or more points. We use models for inflated data sets based on a common discrete family, the power series models, one of the best-known families used for this purpose. This family includes common discrete models such as the Poisson, Negative Binomial, Multinomial, and Logarithmic series models. The main idea of this article is to use zero to k (k = 0, 1, ...) inflated regression models based on the power series family to fit regression trees and random forests. An important point is that these models can be used not only for inflated discrete data but also for non-inflated discrete data, so they apply to a wide range of discrete data sets.
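As a rough illustration of what inflation at the points 0, ..., k means, the sketch below evaluates a zero to k inflated Poisson probability mass function. It assumes the common mixture form, P(Y = y) = w_y + (1 - w_0 - ... - w_k) f(y) for y <= k and P(Y = y) = (1 - w_0 - ... - w_k) f(y) for y > k, with the Poisson as the power series member f. The abstract does not state the authors' parameterization, so this form is an assumption, and the names poisson_pmf, zk_inflated_pmf, and weights are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of the count y with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zk_inflated_pmf(y, weights, lam):
    """Zero to k inflated Poisson pmf (assumed mixture form, not the paper's code).

    weights[i] is the extra probability mass placed on the point i,
    for i = 0, ..., k, where k = len(weights) - 1.
    """
    if any(w < 0 for w in weights) or sum(weights) > 1:
        raise ValueError("weights must be non-negative and sum to at most 1")
    if y < 0:
        return 0.0
    # Mass left over for the base (non-inflated) Poisson component.
    base_weight = 1.0 - sum(weights)
    # Extra mass applies only at the inflated points 0, ..., k.
    extra = weights[y] if y < len(weights) else 0.0
    return extra + base_weight * poisson_pmf(y, lam)


if __name__ == "__main__":
    # Inflation at 0 and 1 (k = 1) on top of a Poisson with mean 2.
    for y in range(5):
        print(y, zk_inflated_pmf(y, [0.2, 0.1], 2.0))
```

With an empty weights list the function reduces to the plain Poisson pmf, which mirrors the abstract's remark that the same model covers non-inflated discrete data.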