Causal inference in case-control studies
Pub Date : 2020-05-04 DOI: 10.1920/wp.cem.2020.1920 Journal: arXiv: Econometrics
S. Lee, S. Jun
We investigate identification of causal parameters in case-control and related studies. The odds ratio in the sample is our main estimand of interest and we articulate its relationship with causal parameters under various scenarios. It turns out that the odds ratio is generally a sharp upper bound for counterfactual relative risk under some monotonicity assumptions, without resorting to strong ignorability or to the rare-disease assumption. Further, we propose semiparametrically efficient, easy-to-implement, machine-learning-friendly estimators of the aggregated (log) odds ratio by exploiting an explicit form of the efficient influence function. Using our new estimators, we develop methods for causal inference and illustrate the usefulness of our methods with a real-data example.
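As a purely descriptive illustration of why an odds ratio can sit above a relative risk, the sketch below computes both from a hypothetical cohort-style 2x2 table. The function name, the table layout, and the counts are assumptions made for this example; it does not implement the counterfactual bound or the estimators described in the abstract, and in a genuine case-control sample the risks themselves would not be directly estimable. It only shows the arithmetic identity OR = RR x (1 - p0) / (1 - p1), which gives OR >= RR whenever p1 >= p0.

```python
def risk_ratio_and_odds_ratio(a, b, c, d):
    """Return (relative risk, odds ratio) for a 2x2 table.

    Hypothetical layout, assumed for this sketch:
      a = exposed with outcome,   b = exposed without outcome,
      c = unexposed with outcome, d = unexposed without outcome.
    """
    p1 = a / (a + b)  # outcome risk among the exposed
    p0 = c / (c + d)  # outcome risk among the unexposed
    relative_risk = p1 / p0
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio


# Made-up counts: risk 0.30 among the exposed, 0.10 among the unexposed.
rr, odds = risk_ratio_and_odds_ratio(30, 70, 10, 90)
print(f"relative risk = {rr:.3f}, odds ratio = {odds:.3f}")
```

With these counts the odds ratio exceeds the relative risk because the outcome is not rare among the exposed; the gap shrinks as both risks approach zero, which is the usual rare-disease approximation.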
State Dependence and Unobserved Heterogeneity in the Extensive Margin of Trade
Pub Date : 2020-04-27 DOI: 10.25932/PUBLISHUP-51191 Journal: arXiv: Econometrics
Julian Hinz, Amrei Stammann, Joschka Wanner
We study the role and drivers of persistence in the extensive margin of bilateral trade. Motivated by a stylized heterogeneous firms model of international trade with market entry costs, we consider dynamic three-way fixed effects binary choice models and study the corresponding incidental parameter problem. The standard maximum likelihood estimator is consistent under asymptotics where all panel dimensions grow at a constant rate, but it has an asymptotic bias in its limiting distribution, invalidating inference even in situations where the bias appears to be small. Thus, we propose two different bias-corrected estimators. Monte Carlo simulations confirm their desirable statistical properties. We apply these estimators in a reassessment of the most commonly studied determinants of the extensive margin of trade. Both true state dependence and unobserved heterogeneity contribute considerably to trade persistence and taking this persistence into account matters significantly in identifying the effects of trade policies on the extensive margin.
Microeconometrics with Partial Identification
Pub Date : 2020-04-24 DOI: 10.1920/wp.cem.2020.1520 Journal: arXiv: Econometrics
Francesca Molinari
This chapter reviews the microeconometrics literature on partial identification, focusing on the developments of the last thirty years. The topics presented illustrate that the available data combined with credible maintained assumptions may yield much information about a parameter of interest, even if they do not reveal it exactly. Special attention is devoted to discussing the challenges associated with, and some of the solutions put forward to, (1) obtain a tractable characterization of the values for the parameters of interest which are observationally equivalent, given the available data and maintained assumptions; (2) estimate this set of values; (3) conduct tests of hypotheses and make confidence statements. The chapter reviews advances in partial identification analysis both as applied to learning (functionals of) probability distributions that are well-defined in the absence of models, and as applied to learning parameters that are well-defined only in the context of particular models. A simple organizing principle is highlighted: the source of the identification problem can often be traced to a collection of random variables that are consistent with the available data and maintained assumptions. This collection may be part of the observed data or be a model implication. In either case, it can be formalized as a random set. Random set theory is then used as a mathematical framework to unify a number of special results and produce a general methodology to carry out partial identification analysis.
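A textbook illustration of how data can be informative about a parameter without revealing it exactly is the worst-case bound on a mean when some outcomes are missing. The sketch below is an assumption-laden example chosen for this listing, not code or notation taken from the chapter: the function name, the bounded-outcome assumption (`y_min`, `y_max`), and the sample values are all hypothetical.

```python
def worst_case_mean_bounds(observed, n_missing, y_min=0.0, y_max=1.0):
    """Bounds on the population mean when n_missing outcomes are unobserved.

    Assumes only that every outcome lies in [y_min, y_max]; nothing is
    assumed about why the outcomes are missing.
    """
    n_total = len(observed) + n_missing
    if n_total == 0:
        raise ValueError("need at least one sampled unit")
    observed_sum = sum(observed)
    # Lower bound: every missing outcome takes the smallest possible value.
    lower = (observed_sum + n_missing * y_min) / n_total
    # Upper bound: every missing outcome takes the largest possible value.
    upper = (observed_sum + n_missing * y_max) / n_total
    return lower, upper


# Six observed binary outcomes and four missing ones (made-up numbers).
lower, upper = worst_case_mean_bounds([1, 0, 1, 1, 0, 1], n_missing=4)
print(f"identified set for the mean: [{lower:.2f}, {upper:.2f}]")
```

The width of the interval equals the share of missing observations times the outcome range, so the bounds tighten as missingness falls and collapse to a point when nothing is missing.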
Revealing Cluster Structures Based on Mixed Sampling Frequencies
Pub Date : 2020-04-21 DOI: 10.17016/FEDS.2020.082 Journal: arXiv: Econometrics
Yeonwoo Rho, Yun Liu, Hie Joo Ahn
This paper proposes a new nonparametric mixed data sampling (MIDAS) model and develops a framework to infer clusters in a panel dataset of mixed sampling frequencies. The nonparametric MIDAS estimation method is more flexible but substantially less costly to estimate than existing approaches. The proposed clustering algorithm successfully recovers true membership in the cross-section both in theory and in simulations without requiring prior knowledge such as the number of clusters. This methodology is applied to estimate a mixed-frequency Okun's law model for the state-level data in the U.S. and uncovers four clusters based on the dynamic features of labor markets.
Decomposing Changes in the Distribution of Real Hourly Wages in the U.S.
Pub Date : 2019-11-18 DOI: 10.1920/wp.cem.2019.6119 Journal: arXiv: Econometrics
Iván Fernández-Val, Franco Peracchi, F. Vella, A. Vuuren
We analyze the sources of changes in the distribution of hourly wages in the United States using CPS data for the survey years 1976 to 2016. We account for the selection bias from the employment decision by modeling the distribution of annual hours of work and estimating a nonseparable model of wages which uses a control function to account for selection. This allows the inclusion of all individuals working positive hours and thus provides a fuller description of the wage distribution. We decompose changes in the distribution of wages into composition, structural and selection effects. Composition effects have increased wages at all quantiles but the patterns of change are generally determined by the structural effects. Evidence of changes in the selection effects appears only at the lower quantiles of the female wage distribution. These various components combine to produce a substantial increase in wage inequality.
Double debiased machine learning nonparametric inference with continuous treatments
Pub Date : 2019-10-21 DOI: 10.1920/wp.cem.2019.5419 Journal: arXiv: Econometrics
K. Colangelo, Ying-Ying Lee
We propose a nonparametric inference method for causal effects of continuous treatment variables, under unconfoundedness and in the presence of high-dimensional or nonparametric nuisance parameters. Our double debiased machine learning (DML) estimators for the average dose-response function (or the average structural function) and the partial effects are asymptotically normal with nonparametric convergence rates. The nuisance estimators for the conditional expectation function and the conditional density can be nonparametric kernel or series estimators or ML methods. Using a kernel-based doubly robust influence function and cross-fitting, we give tractable primitive conditions under which the nuisance estimators do not affect the first-order large sample distribution of the DML estimators. We justify the use of a kernel to localize the continuous treatment at a given value via the Gateaux derivative. We implement various ML methods in Monte Carlo simulations and an empirical application to a job training program evaluation.
Robust likelihood ratio tests for incomplete economic models
Pub Date : 2019-10-10 DOI: 10.1920/WP.CEM.2019.6819 Journal: arXiv: Econometrics
Hiroaki Kaido, Yi Zhang
This study develops a framework for testing hypotheses on structural parameters in incomplete models. Such models make set-valued predictions and hence do not generally yield a unique likelihood function. The model structure, however, allows us to construct tests based on the least favorable pairs of likelihoods using the theory of Huber and Strassen (1973). We develop tests robust to model incompleteness that possess certain optimality properties. We also show that sharp identifying restrictions play a role in constructing such tests in a computationally tractable manner. A framework for analyzing the local asymptotic power of the tests is developed by embedding the least favorable pairs into a model that allows local approximations under the limits of experiments argument. Examples of the hypotheses we consider include those on the presence of strategic interaction effects in discrete games of complete information. Monte Carlo experiments demonstrate the robust performance of the proposed tests.
Bias and consistency in three-way gravity models
Pub Date : 2019-09-03 DOI: 10.1920/WP.CEM.2020.120 Journal: arXiv: Econometrics
Thomas Zylkin, M. Weidner
We study the incidental parameter problem in "three-way" Poisson Pseudo-Maximum Likelihood ("PPML") gravity models recently recommended for identifying the effects of trade policies and in other network panel data settings. Despite the number and variety of fixed effects this model entails, we confirm it is consistent for small $T$ and we show it is in fact the only estimator among a wide range of PML gravity estimators that is generally consistent in this context when $T$ is small. At the same time, asymptotic confidence intervals in fixed-$T$ panels are not correctly centered at the true point estimates, and cluster-robust variance estimates used to construct standard errors are generally biased as well. We characterize each of these biases analytically and show both numerically and empirically that they are salient even for real-data settings with a large number of countries. We also offer practical remedies that can be used to obtain more reliable inferences of the effects of trade policies and other time-varying gravity variables.
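For readers unfamiliar with Poisson pseudo-maximum likelihood, the sketch below fits a plain PPML regression by Newton's method on simulated data. It is a minimal illustration under stated assumptions: it has no exporter-time, importer-time, or pair fixed effects, it does not implement the bias corrections discussed in the abstract, and the function name `ppml_newton`, the data-generating process, and all parameter values are hypothetical. The outcome is deliberately continuous rather than Poisson to show that PPML relies only on the conditional mean being exp(x'beta).

```python
import numpy as np


def ppml_newton(X, y, n_iter=50, tol=1e-10):
    """Solve the PPML first-order condition X'(y - exp(X beta)) = 0.

    Assumes the first column of X is an intercept and y is non-negative
    with a strictly positive mean.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start at the intercept-only solution
    for _ in range(n_iter):
        mu = np.exp(X @ beta)              # fitted conditional mean
        score = X.T @ (y - mu)             # gradient of the pseudo-log-likelihood
        hessian = X.T @ (X * mu[:, None])  # negative Hessian, X' diag(mu) X
        step = np.linalg.solve(hessian, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated data: exponential conditional mean, multiplicative gamma noise
# with mean one, so the outcome is continuous and not Poisson distributed.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 0.3])
y = np.exp(X @ true_beta) * rng.gamma(shape=2.0, scale=0.5, size=n)

beta_hat = ppml_newton(X, y)
print("estimated coefficients:", beta_hat)
```

In applied gravity work the fixed effects are typically handled by dedicated high-dimensional routines rather than by building dummy columns into `X`; the dense Newton step here is only practical for a handful of regressors.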
The economics of minority language use: theory and empirical evidence for a language game model
Pub Date : 2019-08-30 DOI: 10.2139/ssrn.3445345 Journal: arXiv: Econometrics
S. Sperlich, J. Uriarte
Language and cultural diversity is a fundamental aspect of the present world. We study three modern multilingual societies -- the Basque Country, Ireland and Wales -- which are endowed with two linguistically distant official languages: $A$, spoken by all individuals, and $B$, spoken by a bilingual minority. In all three cases a decay in the use of the minority language $B$ is observed, a sign of diversity loss. According to the Council of Europe, however, the key factor in preventing the shift away from $B$ is its use in all domains. Thus, we investigate the language choices of the bilinguals by means of an evolutionary game theoretic model. We show that the language population dynamics has reached an evolutionarily stable equilibrium where a fraction of bilinguals have shifted to speaking $A$. Thus, this equilibrium captures the decline in the use of $B$. To test the theory we build empirical models that predict the use of $B$ for each proportion of bilinguals. We show that model-based predictions fit the observed use of Basque, Irish, and Welsh very well.