Random forests in the zero to k inflated Power series populations

Statistics, Optimization & Information Computing Pub Date : 2023-08-03 DOI:10.19139/soic-2310-5070-1773

H. Saboori, Mahdi Doostparast



Abstract

Tree-based algorithms are a class of useful, versatile, and popular tools in data mining and machine learning. Indeed, tree aggregation methods, such as random forests, are among the most powerful approaches for boosting predictive performance. In this article, we apply tree-based methods to model and predict discrete data, using a highly flexible model. Discrete data may be inflated at some points: at zero, at one, or at other values, and possibly at two or more points simultaneously. We use models for inflated data sets based on a common discrete family, the power series models, one of the best-known families used for this purpose. This family includes common discrete models such as the Poisson, Negative Binomial, Multinomial, and Logarithmic series models. The main idea of this article is to use zero to k (k = 0, 1, ...) inflated regression models based on the power series family to fit decision regression trees and random forests. An important feature of these models is that they can be used not only for inflated discrete data but also for non-inflated discrete data, so they apply to a wide range of discrete data sets.
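
To make "inflation at the points zero to k" concrete, the following is a minimal sketch of a zero-to-k inflated Poisson probability mass function. It assumes the usual mixture form, in which each point j = 0, ..., k receives an extra mass pi_j and the remaining weight 1 - (pi_0 + ... + pi_k) is placed on an ordinary Poisson distribution. The abstract does not state this parameterization, so the form, the function name `zki_poisson_pmf`, and its arguments are illustrative assumptions rather than the authors' implementation.

```python
import math


def poisson_pmf(y, lam):
    """Ordinary Poisson probability P(Y = y) with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zki_poisson_pmf(y, lam, pi):
    """Zero-to-k inflated Poisson pmf (assumed mixture form).

    pi is a list [pi_0, ..., pi_k] of extra masses at the points 0..k.
    An empty list gives the plain (non-inflated) Poisson model.
    """
    if any(p < 0 for p in pi) or sum(pi) > 1:
        raise ValueError("inflation weights must be non-negative and sum to at most 1")
    if y < 0:
        return 0.0
    base_weight = 1.0 - sum(pi)            # weight left for the Poisson part
    extra = pi[y] if y < len(pi) else 0.0  # extra mass only at points 0..k
    return extra + base_weight * poisson_pmf(y, lam)


if __name__ == "__main__":
    # Inflation at 0 and 1 (k = 1) on top of a Poisson with mean 2.
    for y in range(5):
        print(y, round(zki_poisson_pmf(y, 2.0, [0.2, 0.1]), 4))
```

Passing an empty list for `pi` recovers the plain Poisson model, which mirrors the abstract's point that the same model family covers both inflated and non-inflated discrete data.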


