Optimal trimming proportion in regression analysis for non-normal distributions
Amit Mitra, Pankush Kalgotra
Journal of Business Analytics (JCR Q3, Computer Science, Information Systems; impact factor 1.7)
Published 2021-11-29, Journal Article
DOI: 10.1080/2573234X.2021.2007803 (https://doi.org/10.1080/2573234X.2021.2007803)
Citations: 0
Abstract
Regression analysis is a widely used modelling tool in business decision making. However, proper application of this methodology requires that certain assumptions associated with the model be satisfied. The assumption we focus on is the normality of the response variable, which is directly related to the assumption of normality of the error component. In a variety of fields in business, such as finance, marketing, information systems, operations, and healthcare, the selected dependent variable does not inherently have a normal distribution. In the regression context, where the model parameters and independent variables are assumed to be constant, the distribution of the random error component thus influences the distribution of the dependent variable. Here, we study the impact of symmetric and asymmetric error distributions on the performance of the estimated model parameters. To create robust estimates, we trim the response variable and, via a simulation procedure, study the effectiveness of the trimmed estimators relative to the ordinary least squares (OLS) estimator. We then recommend the optimal trimming proportion, chosen to minimise the ratio of the mean squared error of the trimmed estimator to that of the OLS estimator.
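The comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: the abstract does not state the trimming rule, the error distributions, the sample sizes, or which parameter is evaluated. It assumes, for illustration only, a simple linear model, mean-zero shifted exponential errors as one asymmetric case, symmetric trimming of the observations with the most extreme responses, and the slope as the parameter of interest. All function names and default values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the least squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def trim_on_response(x, y, proportion):
    """Drop the given proportion of points from each tail of the response.

    Assumed trimming rule: the abstract does not specify one.
    """
    if not 0.0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    n = len(y)
    k = int(proportion * n)  # number of points removed per tail
    order = sorted(range(n), key=lambda i: y[i])
    kept = order[k:n - k]
    return [x[i] for i in kept], [y[i] for i in kept]


def mse_ratio(proportion, n=50, reps=500, beta0=1.0, beta1=2.0, seed=0):
    """Estimate MSE(trimmed slope) / MSE(OLS slope) by simulation."""
    rng = random.Random(seed)  # same seed gives the same samples for every proportion
    sq_err_trim = 0.0
    sq_err_ols = 0.0
    for _ in range(reps):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        # Assumed asymmetric errors: exponential(1) shifted to have mean zero.
        e = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, e)]
        xt, yt = trim_on_response(x, y, proportion)
        sq_err_ols += (ols_slope(x, y) - beta1) ** 2
        sq_err_trim += (ols_slope(xt, yt) - beta1) ** 2
    return sq_err_trim / sq_err_ols


# Grid search over a few candidate trimming proportions (arbitrary choices).
ratios = {p: mse_ratio(p) for p in (0.0, 0.05, 0.10, 0.20)}
best_proportion = min(ratios, key=ratios.get)
print(ratios, best_proportion)
```

A ratio below 1 would indicate that the trimmed estimator has lower mean squared error than OLS in that setting, and a ratio above 1 the opposite. The outcome depends heavily on the assumed error distribution, trimming rule, and design, so this sketch should not be read as reproducing the paper's recommendation.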