On outliers detection and prior distribution sensitivity in standard skew-probit regression models
Authors: Fabiano Rodrigues Coelho, C. Russo, J. Bazán
Journal: Brazilian Journal of Probability and Statistics (Impact Factor 0.6; JCR Q4, Statistics & Probability)
Published: 2022-09-01
DOI: https://doi.org/10.1214/22-bjps534
Citations: 2
Abstract
Regression models with probit and logit link functions are the most frequently used models for binary response variables. However, these traditional approaches may not be adequate when the data are unbalanced. This paper deals with standard skew-probit regression models. Parameters were estimated through a new Bayesian approach that combines Hamiltonian Monte Carlo (HMC) with the original likelihood function. Simulation studies assessed the efficiency of the estimation method and the sensitivity to the prior distributions of the asymmetry-related parameters, using the root mean square error (RMSE). The proposed estimation method was also evaluated in the detection of outliers. The results show that the proposed method is more efficient than INLA and successfully recovers the true parameter values. The sensitivity study led to the proposal of a new prior distribution configuration for the asymmetry parameter, and the randomized quantile residual proved to be more suitable for detecting outliers. The methodology was applied to a diabetes dataset to illustrate the results.
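The quantities named in the abstract can be illustrated with a minimal sketch. It assumes that the skew-probit link is the CDF of Azzalini's skew-normal distribution with shape parameter `lam`, and that the randomized quantile residual follows the usual Dunn and Smyth construction for a Bernoulli response; the paper's exact parameterization may differ. All function names and the numerical settings (`lower`, `steps`) are hypothetical and are not taken from the paper.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def skew_normal_cdf(x, lam, lower=-10.0, steps=2000):
    """CDF of Azzalini's skew-normal with shape `lam`, via Simpson's rule.

    The density is 2 * phi(t) * Phi(lam * t). `lower` and `steps` are
    illustrative numerical settings; `steps` must be even.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps

    def pdf(t):
        return 2.0 * _STD_NORMAL.pdf(t) * _STD_NORMAL.cdf(lam * t)

    total = pdf(lower) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(lower + i * h)
    return total * h / 3.0


def success_probability(eta, lam):
    """Assumed skew-probit link: P(y = 1) = F_SN(eta; lam)."""
    return skew_normal_cdf(eta, lam)


def randomized_quantile_residual(y, p, rng):
    """Randomized quantile residual for a Bernoulli(p) observation y."""
    # Draw u uniformly between F(y-) and F(y) of the Bernoulli CDF:
    # y = 0 gives (0, 1 - p); y = 1 gives (1 - p, 1).
    lo, hi = (0.0, 1.0 - p) if y == 0 else (1.0 - p, 1.0)
    u = rng.uniform(lo, hi)
    # Clamp away from 0 and 1 so the normal quantile stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return _STD_NORMAL.inv_cdf(u)


def rmse(estimates, truth):
    """Root mean square error of replicated estimates around a true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


if __name__ == "__main__":
    rng = random.Random(0)
    p = success_probability(1.5, lam=2.0)  # fitted probability, illustrative inputs
    print("p =", p)
    print("residual for y = 0:", randomized_quantile_residual(0, p, rng))
    print("RMSE:", rmse([0.9, 1.1, 1.05], truth=1.0))
```

Under these assumptions, an observation with y = 0 and a fitted probability close to 1 yields a large negative residual, which is the kind of point an outlier check would flag. With `lam = 0` the link reduces to the ordinary probit.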
About the journal:
The Brazilian Journal of Probability and Statistics aims to publish high quality research papers in applied probability, applied statistics, computational statistics, mathematical statistics, probability theory and stochastic processes.
More specifically, the following types of contributions will be considered:
(i) Original articles dealing with methodological developments, comparison of competing techniques or their computational aspects.
(ii) Original articles developing theoretical results.
(iii) Articles that contain novel applications of existing methodologies to practical problems. For these papers the focus is on the importance and originality of the applied problem, as well as on the application of the best available methodologies to solve it.
(iv) Survey articles containing a thorough coverage of topics of broad interest to probability and statistics. The journal will occasionally publish book reviews, invited papers and essays on the teaching of statistics.