We provide various norm-based definitions of different types of cross-sectional dependence and the relations between them. These definitions help to comprehend and characterize the various forms of cross-sectional dependence, such as strong, semi-strong, and weak dependence. We then examine the asymptotic properties of the fixed (within) effects estimator and the random effects (pooled) estimator for linear panel data models incorporating various forms of cross-sectional dependence. The asymptotic properties are also derived when both cross-sectional and temporal dependence are present. Subsequently, we develop consistent, robust standard errors of the parameter estimators for the fixed effects and random effects models separately. Robust standard errors are developed (i) for pure cross-sectional dependence, and (ii) for combined cross-sectional and time series dependence. Under strong or semi-strong cross-sectional dependence, it is established that when time dependence enters through the idiosyncratic errors, it has no influence on the asymptotic variance of $\hat{\beta}_{FE/RE}$. Hence, it is argued that a Newey-West-type correction in estimating $\mathrm{Var}(\hat{\beta}_{FE/RE})$ injects bias into the variance estimate. Furthermore, this article lays down conditions under which $t$, $F$ and Wald statistics based on the robust covariance matrix estimator give valid inference.
{"title":"Understanding Cross-Sectional Dependence in Panel Data","authors":"G. Basak, Samarjit Das","doi":"10.2139/ssrn.3167337","DOIUrl":"https://doi.org/10.2139/ssrn.3167337","url":null,"abstract":"We provide various norm-based definitions of different types of cross-sectional dependence and the relations between them. These definitions facilitate to comprehend and to characterize the various forms of cross-sectional dependence, such as strong, semi-strong, and weak dependence. Then we examine the asymptotic properties of parameter estimators both for fixed (within) effect estimator and random effect (pooled) estimator for linear panel data models incorporating various forms of cross-sectional dependence. The asymptotic properties are also derived when both cross-sectional and temporal dependence are present. Subsequently, we develop consistent and robust standard error of the parameter estimators both for fixed effect and random effect model separately. Robust standard errors are developed (i) for pure cross-sectional dependence; and (ii) also for cross-sectional and time series dependence. Under strong or semi-strong cross-sectional dependence, it is established that when the time dependence comes through the idiosyncratic errors, such time dependence does not have any influence in the asymptotic variance of $(hat{beta}_{FE/RE}). $ Hence, it is argued that in estimating $Var(hat{beta}_{FE/RE}),$ Newey-West kind of correction injects bias in the variance estimate. Furthermore, this article lay down conditions under which $t$, $F$ and the $Wald$ statistics based on the robust covariance matrix estimator give valid inference.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121693417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In a recent contribution to the financial econometrics literature, Chu et al. (2017) provide the first examination of the time-series price behaviour of the most popular cryptocurrencies. However, insufficient attention was paid to correctly diagnosing the distribution of the GARCH innovations. When these data issues are controlled for, their results lack robustness and may lead to either underestimation or overestimation of future risks. The main aim of this paper is therefore to provide an improved econometric specification. Particular attention is paid to correctly diagnosing the distribution of the GARCH innovations by means of Kolmogorov-type non-parametric tests and Khmaladze's martingale transformation. Numerical computation is carried out using Gauss-Kronrod quadrature. Parameters of the GARCH models are estimated by maximum likelihood, and p-values are computed with the parametric bootstrap. Further reference is made to the merits and demerits of the statistical techniques presented in the related, recently published literature.
{"title":"Conditional Heteroskedasticity in Crypto-Asset Returns","authors":"Charles Shaw","doi":"10.2139/ssrn.3094024","DOIUrl":"https://doi.org/10.2139/ssrn.3094024","url":null,"abstract":"In a recent contribution to the financial econometrics literature, Chu et al. (2017) provide the first examination of the time-series price behaviour of the most popular cryptocurrencies. However, insufficient attention was paid to correctly diagnosing the distribution of GARCH innovations. When these data issues are controlled for, their results lack robustness and may lead to either underestimation or overestimation of future risks. The main aim of this paper therefore is to provide an improved econometric specification. Particular attention is paid to correctly diagnosing the distribution of GARCH innovations by means of Kolmogorov type non-parametric tests and Khmaladze's martingale transformation. Numerical computation is carried out by implementing a Gauss-Kronrod quadrature. Parameters of GARCH models are estimated using maximum likelihood. For calculating P-values, the parametric bootstrap method is used. Further reference is made to the merits and demerits of statistical techniques presented in the related and recently published literature.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116183772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider the regression discontinuity (RD) design with a duration outcome that has discrete support. The parameters of policy interest are the treatment effects on the unconditional (duration effect) and conditional (hazard effect) exiting probabilities at each discrete level. We propose a novel semi-nonparametric estimator that exploits a flexible separability structure of the underlying continuous-time duration process. Simultaneous inference across discrete levels is nonstandard because the asymptotic variance matrix is singular with unknown rank; this peculiarity is delivered by the nature of the RD estimand, and we provide solutions. Random censoring and competing risks can also be accommodated in our framework.
{"title":"A Semi-Nonparametric Estimator of Regression Discontinuity Design with Discrete Duration Outcomes","authors":"Ke-Li Xu","doi":"10.2139/ssrn.3095673","DOIUrl":"https://doi.org/10.2139/ssrn.3095673","url":null,"abstract":"Abstract We consider the regression discontinuity (RD) design with the duration outcome which has discrete support. The parameters of policy interest are treatment effects on unconditional (duration effect) and conditional (hazard effect) exiting probabilities for each discrete level. We propose a novel semi-nonparametric estimator which exploits a flexible separability structure of the underlying continuous-time duration process. Simultaneous inference over discrete levels is nonstandard since the asymptotic variance matrix is singular with unknown rank. The peculiarity is delivered by the nature of the RD estimand, and we provide solutions. Random censoring and competing risks can also be allowed in our framework.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130575589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Two-stage least squares (2SLS) is a celebrated method for estimating the structural coefficients of a simultaneous linear equations model. It uses OLS at the first stage to estimate the reduced-form coefficients and obtain the expected values of the current endogenous variables. At the second stage it uses OLS, equation by equation, with the expected current endogenous variables serving as instruments for their observed counterparts. It has been pointed out that, since the expected current endogenous variables are linear functions of the predetermined variables in the model, including them as regressors together with a subset of the predetermined variables makes the estimation procedure susceptible to the deleterious effects of collinearity, which may leave some of the estimated structural coefficients with inflated variances or the wrong sign. As a remedy to this problem, the use of Shapley value regression at the second stage is proposed. For illustration, a model is constructed in which measures of different aspects of globalization are the endogenous variables and measures of different aspects of democracy are the predetermined variables. The conventional (OLS-based) 2SLS yields some estimated structural coefficients with an unexpected sign, whereas all structural coefficients estimated with the proposed 2SLS (with Shapley value regression at the second stage) have the expected sign. These empirical findings suggest that the measures of globalization are mutually consistent and are positively affected by democratic regimes.
{"title":"A New Kind of Two-Stage Least Squares Based on Shapley Value Regression","authors":"Sudhanshu K. Mishra","doi":"10.2139/ssrn.3094512","DOIUrl":"https://doi.org/10.2139/ssrn.3094512","url":null,"abstract":"The Two-Stage Least squares method for obtaining the estimated structural coefficients of a simultaneous linear equations model is a celebrated method that uses OLS at the first stage for estimating the reduced form coefficients and obtaining the expected values in the arrays of current exogenous variables. At the second stage it uses OLS, equation by equation, in which the explanatory expected current endogenous variables are used as instruments representing their observed counterpart. It has been pointed out that since the explanatory expected current endogenous variables are linear functions of the predetermined variables in the model, inclusion of such expected current endogenous variables together with a subset of predetermined variables as regressors make the estimation procedure susceptible to the deleterious effects of collinearity, which may render some of the estimated structural coefficients with inflated variance as well as wrong sign. As a remedy to this problem, the use of Shapley value regression at the second stage has been proposed. For illustration a model has been constructed in which the measures of the different aspects of globalization are the endogenous variables while the measures of the different aspects of democracy are the predetermined variables. It has been found that the conventional (OLS-based) Two-Stage Least Squares (2-SLS) gives some of the estimated structural coefficients with an unexpected sign. In contrast, all structural coefficients estimated with the proposed 2-SLS (in which Shapley value regression has been used at the second stage) have an expected sign. These empirical findings suggest that the measures of globalization are conformal among themselves as well as they are positively affected by democratic regimes.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"29 24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130505017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a general framework for studying regularized estimators, i.e., estimation problems wherein "plug-in" type estimators are either ill-defined or ill-behaved. We derive primitive conditions that imply consistency and an asymptotic linear representation for regularized estimators, allowing for slower-than-$\sqrt{n}$ convergence rates as well as infinite-dimensional parameters. We also provide data-driven methods for choosing tuning parameters that, under some conditions, achieve the aforementioned results. We illustrate the scope of our approach by studying a wide range of applications, revisiting known results and deriving new ones.
{"title":"Some Large Sample Results for the Method of Regularized Estimators","authors":"Michael Jansson, Demian Pouzo","doi":"10.2139/SSRN.3090731","DOIUrl":"https://doi.org/10.2139/SSRN.3090731","url":null,"abstract":"We present a general framework for studying regularized estimators; i.e., estimation problems wherein \"plug-in\" type estimators are either ill-defined or ill-behaved. We derive primitive conditions that imply consistency and asymptotic linear representation for regularized estimators, allowing for slower than $sqrt{n}$ estimators as well as infinite dimensional parameters. We also provide data-driven methods for choosing tuning parameters that, under some conditions, achieve the aforementioned results. We illustrate the scope of our approach by studying a wide range of applications, revisiting known results and deriving new ones.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124519262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a new variational Bayes method for estimating high-dimensional copulas with discrete, or discrete and continuous, margins. The method is based on a variational approximation to a tractable augmented posterior, and is substantially faster than previous likelihood-based approaches. We use it to estimate drawable vine copulas for univariate and multivariate Markov ordinal and mixed time series. These copulas have dimension $rT$, where $T$ is the number of observations and $r$ is the number of series, and are difficult to estimate using previous methods. The vine pair-copulas are carefully selected to allow for heteroskedasticity, a common feature of ordinal time series data. When combined with flexible margins, the resulting time series models also allow for other common features of ordinal data, such as zero inflation, multiple modes, and under- or over-dispersion. Using data on homicides in New South Wales and on U.S. bankruptcies, we illustrate both the flexibility of the time series copula models and the efficacy of the variational Bayes estimator for copulas of up to 792 dimensions and 60 parameters. This far exceeds the size and complexity of copula models for discrete data that can be estimated using previous methods.
{"title":"Variational Bayes Estimation of Time Series Copulas for Multivariate Ordinal and Mixed Data","authors":"Rubén Albeiro Loaiza Maya, M. Smith","doi":"10.2139/ssrn.3093123","DOIUrl":"https://doi.org/10.2139/ssrn.3093123","url":null,"abstract":"We propose a new variational Bayes method for estimating high-dimensional copulas with discrete, or discrete and continuous, margins. The method is based on a variational approximation to a tractable augmented posterior, and is substantially faster than previous likelihood-based approaches. We use it to estimate drawable vine copulas for univariate and multivariate Markov ordinal and mixed time series. These have dimension $rT$, where $T$ is the number of observations and $r$ is the number of series, and are difficult to estimate using previous methods. The vine pair-copulas are carefully selected to allow for heteroskedasticity, which is a common feature of ordinal time series data. When combined with flexible margins, the resulting time series models also allow for other common features of ordinal data, such as zero inflation, multiple modes and under- or over-dispersion. Using data on homicides in New South Wales, and also U.S bankruptcies, we illustrate both the flexibility of the time series copula models, and the efficacy of the variational Bayes estimator for copulas of up to 792 dimensions and 60 parameters. This far exceeds the size and complexity of copula models for discrete data that can be estimated using previous methods.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114975542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When there is exact collinearity between regressors, their individual coefficients are not identified, but given an informative prior their Bayesian posterior means are well defined. The case of high but not exact collinearity is more complicated, but similar results follow. Just as exact collinearity causes non-identification of the parameters, high collinearity can be viewed as weak identification of the parameters, which we represent, in line with the weak-instrument literature, by a correlation matrix that is of full rank for a finite sample size T but converges to a rank-deficient matrix as T goes to infinity. This paper examines the asymptotic behavior of the posterior mean and precision of the parameters of a linear regression model for both exactly and highly collinear regressors. We show that in both cases the posterior mean remains sensitive to the choice of prior means even when the sample size is large, and that the precision rises at a slower rate than the sample size. In the highly collinear case, the posterior means converge to normally distributed random variables whose mean and variance depend on the priors for the coefficients and the precision. The distribution degenerates to fixed points under either exact collinearity or strong identification. The analysis also suggests a diagnostic statistic for the highly collinear case, which is illustrated with an empirical example.
{"title":"Posterior Means and Precisions of the Coefficients in Linear Models with Highly Collinear Regressors","authors":"M. Pesaran, Ron P. Smith","doi":"10.2139/ssrn.3076052","DOIUrl":"https://doi.org/10.2139/ssrn.3076052","url":null,"abstract":"When there is exact collinearity between regressors, their individual coefficients are not identified, but given an informative prior their Bayesian posterior means are well defined. The case of high but not exact collinearity is more complicated but similar results follow. Just as exact collinearity causes non-identification of the parameters, high collinearity can be viewed as weak identification of the parameters, which we represent, in line with the weak instrument literature, by the correlation matrix being of full rank for a finite sample size T, but converging to a rank defficient matrix as T goes to infinity. This paper examines the asymptotic behavior of the posterior mean and precision of the parameters of a linear regression model for both the cases of exactly and highly collinear regressors. We show that in both cases the posterior mean remains sensitive to the choice of prior means even if the sample size is sufficiently large, and that the precision rises at a slower rate than the sample size. In the highly collinear case, the posterior means converge to normally distributed random variables whose mean and variance depend on the priors for coefficients and precision. The distribution degenerates to fixed points for either exact collinearity or strong identification. The analysis also suggests a diagnostic statistic for the highly collinear case, which is illustrated with an empirical example.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116250259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper develops a method to efficiently estimate hidden Markov models with continuous latent variables using maximum likelihood. To evaluate the (marginal) likelihood function, I decompose the integral over the unobserved state variables into a series of lower-dimensional integrals and recursively approximate them using numerical quadrature and interpolation. I show that this procedure has very favorable numerical properties: first, the computational complexity grows linearly in time, which makes integration over hundreds or thousands of periods feasible; second, I prove that the numerical error accumulates sub-linearly over time, so that, using highly efficient and fast-converging quadrature and interpolation methods for low and medium dimensions, such as Gaussian quadrature and Chebyshev polynomials, the numerical error can be well controlled even for very large numbers of periods; lastly, I show that the numerical convergence rates of the quadrature and interpolation methods are preserved up to a factor of at least 0.5 under appropriate assumptions. I apply the method to Rust's bus engine replacement model: first, I verify the algorithm's ability to recover the parameters in an extensive Monte Carlo study with simulated datasets; second, I estimate the model using the original dataset.
{"title":"Divide and Conquer: Recursive Likelihood Function Integration for Hidden Markov Models with Continuous Latent Variables","authors":"Gregor Reich","doi":"10.2139/ssrn.2794884","DOIUrl":"https://doi.org/10.2139/ssrn.2794884","url":null,"abstract":"This paper develops a method to efficiently estimate hidden Markov models with continuous latent variables using maximum likelihood estimation. To evaluate the (marginal) likelihood function, I decompose the integral over the unobserved state variables into a series of lower dimensional integrals, and recursively approximate them using numerical quadrature and interpolation. I show that this procedure has very favorable numerical properties: First, the computational complexity grows linearly in time, which makes the integration over hundreds and thousands of periods well feasible. Second, I prove that the numerical error is accumulated sub-linearly over time; consequently, using highly efficient and fast converging numerical quadrature and interpolation methods for low and medium dimensions, such as Gaussian quadrature and Chebyshev polynomials, the numerical error can be well controlled even for very large numbers of periods. Lastly, I show that the numerical convergence rates of the quadrature and interpolation methods are preserved up to a factor of at least 0.5 under appropriate assumptions.I apply this method to the bus engine replacement model of Rust: first, I verify the algorithm’s ability to recover the parameters in an extensive Monte Carlo study with simulated datasets; second, I estimate the model using the original dataset.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134371809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The National Establishment Time Series (NETS) is a private-sector source of U.S. business microdata. Researchers have used state-specific NETS extracts for many years, but relatively little is known about the accuracy and representativeness of the nationwide NETS sample. We explore the properties of NETS as compared with official U.S. data on business activity: the Census Bureau's County Business Patterns (CBP) and Nonemployer Statistics (NES) and the Bureau of Labor Statistics' Quarterly Census of Employment and Wages (QCEW). We find that the NETS universe does not cover the entirety of the Census-based employer and nonemployer universes, but with certain restrictions NETS can be made to mimic official employer datasets with reasonable precision. The largest differences between NETS employer data and official sources are among small establishments, where imputation is prevalent in NETS. The most stringent of our proposed sample restrictions still leaves a scope covering about three-quarters of U.S. private sector employment. We conclude that NETS microdata can be useful and convenient for studying static business activity in high detail.
{"title":"An Assessment of the National Establishment Time Series (Nets) Database","authors":"Keith Barnatchez, Leland D. Crane, Ryan A. Decker","doi":"10.17016/FEDS.2017.110","DOIUrl":"https://doi.org/10.17016/FEDS.2017.110","url":null,"abstract":"The National Establishment Time Series (NETS) is a private sector source of U.S. business microdata. Researchers have used state-specific NETS extracts for many years, but relatively little is known about the accuracy and representativeness of the nationwide NETS sample. We explore the properties of NETS as compared to official U.S. data on business activity: The Census Bureau's County Business Patterns (CBP) and Nonemployer Statistics (NES) and the Bureau of Labor Statistics' Quarterly Census of Employment and Wages (QCEW). We find that the NETS universe does not cover the entirety of the Census-based employer and nonemployer universes, but given certain restrictions NETS can be made to mimic official employer datasets with reasonable precision. The largest differences between NETS employer data and official sources are among small establishments, where imputation is prevalent in NETS. The most stringent of our proposed sample restrictions still allows scope that cover s about three quarters of U.S. private sector employment. We conclude that NETS microdata can be useful and convenient for studying static business activity in high detail.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114682342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a flexible and robust nonparametric local logit regression for modelling and predicting defaulted loans' recovery rates, which lie in [0,1]. Applying the model to the widely studied Moody's recovery dataset and estimating it by a data-driven method, the local logit regression uncovers the underlying nonlinear relationship between recoveries and the covariates, which include loan/borrower characteristics and economic conditions. We find significant nonlinear marginal and interaction effects of the conditioning variables on recoveries of defaulted loans. The presence of such nonlinear economic effects enriches the local logit model specification and supports improved recovery prediction. This paper is the first to study a nonparametric regression model that not only generates unbiased and improved recovery predictions relative to the parametric counterpart but also facilitates reliable inference on the marginal and interaction effects of loan/borrower characteristics and economic conditions. Moreover, by incorporating these nonlinear marginal and interaction effects, we improve the specification of the parametric regression for fractional response variables, which we call the "calibrated" model and whose predictive performance is comparable to that of the local logit model. This calibrated parametric model will be attractive to applied researchers and industry professionals who work in risk management and are unfamiliar with nonparametric machinery.
{"title":"Local Logit Regression for Recovery Rate","authors":"Nithi Sopitpongstorn, P. Silvapulle, Jiti Gao","doi":"10.2139/ssrn.3053774","DOIUrl":"https://doi.org/10.2139/ssrn.3053774","url":null,"abstract":"We propose a flexible and robust nonparametric local logit regression for modelling and predicting defaulted loans' recovery rates that lie in [0,1]. Applying the model to the widely studied Moody's recovery dataset and estimating it by a data-driven method, the local logit regression uncovers the underlying nonlinear relationship between the recovery and covariates, which include loan/borrower characteristics and economic conditions. We find some significant nonlinear marginal and interaction effects of conditioning variables on recoveries of defaulted loans. The presence of such nonlinear economic effects enriches the local logit model specification that supports the improved recovery prediction. This paper is the first to study a nonparametric regression model that not only generates unbiased and improved recovery predictions of defaulted loans relative to the parametric counterpart, it also facilitates reliable inference on marginal and interaction effects of loan/borrower characteristics and economic conditions. Moreover, incorporating these nonlinear marginal and interaction effects, we improve the specification of parametric regression for fractional response variable, which we call \"calibrated\" model, the predictive performance of which is comparable to that of local logit model. This calibrated parametric model will be attractive to applied researchers and industry professionals working in the risk management area and unfamiliar with nonparametric machinery.","PeriodicalId":320844,"journal":{"name":"PSN: Econometrics","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128916626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}