Attrition is a common and potentially important threat to internal validity in treatment effect studies. We extend the changes-in-changes approach to identify the average treatment effect for respondents and the entire study population in the presence of attrition. Our method, which exploits baseline outcome data, can be applied to randomized experiments as well as quasi-experimental difference-in-difference designs. A formal comparison highlights that while widely used corrections typically impose restrictions on whether or how response depends on treatment, our proposed attrition correction exploits restrictions on the outcome model. We further show that the conditions required for our correction can accommodate a broad class of response models that depend on treatment in an arbitrary way. We illustrate the implementation of the proposed corrections in an application to a large-scale randomized experiment.
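The changes-in-changes building block behind the correction can be illustrated with the standard Athey–Imbens quantile mapping. The sketch below is the textbook transform only, not the paper's attrition-corrected estimator, and the function name is ours:

```python
import numpy as np

def cic_counterfactual(y00, y01, y10):
    """Standard changes-in-changes transform: map each treated-group baseline
    outcome in y10 to its rank in the control-group baseline distribution
    (y00), then read off the control-group follow-up distribution (y01) at
    that rank, i.e. F_{01}^{-1}(F_{00}(y))."""
    y00s, y01s = np.sort(y00), np.sort(y01)
    ranks = np.searchsorted(y00s, y10, side="right") / len(y00s)
    return np.quantile(y01s, np.clip(ranks, 0.0, 1.0))
```

If the control group's outcome distribution shifts by a constant over time, the mapped counterfactual for the treated group shifts by the same constant, which is the intuition the estimator builds on.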
{"title":"Correcting attrition bias using changes-in-changes","authors":"Dalia Ghanem , Sarojini Hirshleifer , Désiré Kédagni , Karen Ortiz-Becerra","doi":"10.1016/j.jeconom.2024.105737","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105737","url":null,"abstract":"<div><p>Attrition is a common and potentially important threat to internal validity in treatment effect studies. We extend the changes-in-changes approach to identify the average treatment effect for respondents and the entire study population in the presence of attrition. Our method, which exploits baseline outcome data, can be applied to randomized experiments as well as quasi-experimental difference-in-difference designs. A formal comparison highlights that while widely used corrections typically impose restrictions on whether or how response depends on treatment, our proposed attrition correction exploits restrictions on the outcome model. We further show that the conditions required for our correction can accommodate a broad class of response models that depend on treatment in an arbitrary way. We illustrate the implementation of the proposed corrections in an application to a large-scale randomized experiment.</p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"241 2","pages":"Article 105737"},"PeriodicalIF":6.3,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140818395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-28. DOI: 10.1016/j.jeconom.2024.105727
Wenjie Wang, Yichong Zhang
We study wild bootstrap inference for instrumental variable regressions under an alternative asymptotic framework in which the number of independent clusters is fixed, the size of each cluster diverges to infinity, and the within-cluster dependence is sufficiently weak. We first show that the wild bootstrap Wald test controls size asymptotically up to a small error as long as the parameters of the endogenous variables are strongly identified in at least one of the clusters. Second, we establish the conditions for the bootstrap tests to have power against local alternatives. We further develop a wild bootstrap Anderson–Rubin test for full-vector inference and show that it controls size asymptotically even under weak identification in all clusters. We illustrate the good performance of both tests using simulations and provide an empirical application to a well-known dataset about US local labor markets.
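As a rough illustration of the restricted wild cluster bootstrap for a slope coefficient (the paper's Wald and Anderson–Rubin constructions for IV are more involved), a numpy-only sketch with Rademacher weights drawn at the cluster level might look like this; the unstudentized statistic and all names are our simplifications:

```python
import numpy as np

def wild_cluster_bootstrap_p(y, x, cluster, beta0=0.0, B=999, seed=0):
    """Wild cluster bootstrap p-value for H0: slope = beta0 in y = a + b*x + e.
    Residuals are computed under the null and flipped with one Rademacher
    weight per cluster; the statistic here is the unstudentized |b - beta0|."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])
    # Unrestricted fit and observed statistic.
    b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    t_obs = abs(b_hat[1] - beta0)
    # Restricted fit imposing the null (slope fixed at beta0).
    a_r = np.mean(y - beta0 * x)
    resid_r = y - a_r - beta0 * x
    clusters = np.unique(cluster)
    idx = np.searchsorted(clusters, cluster)  # map labels to 0..G-1
    t_boot = np.empty(B)
    for b in range(B):
        w = rng.choice([-1.0, 1.0], size=len(clusters))  # Rademacher per cluster
        y_star = a_r + beta0 * x + resid_r * w[idx]
        b_star = np.linalg.lstsq(X, y_star, rcond=None)[0]
        t_boot[b] = abs(b_star[1] - beta0)
    return (1 + np.sum(t_boot >= t_obs)) / (B + 1)
```

Because every observation in a cluster shares one weight, the bootstrap world preserves within-cluster dependence, which is the point of the wild cluster scheme.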
{"title":"Wild bootstrap inference for instrumental variables regressions with weak and few clusters","authors":"Wenjie Wang , Yichong Zhang","doi":"10.1016/j.jeconom.2024.105727","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105727","url":null,"abstract":"<div><p>We study the wild bootstrap inference for instrumental variable regressions under an alternative asymptotic framework that the number of independent clusters is fixed, the size of each cluster diverges to infinity, and the within cluster dependence is sufficiently weak. We first show that the wild bootstrap Wald test controls size asymptotically up to a small error as long as the parameters of endogenous variables are strongly identified in at least one of the clusters. Second, we establish the conditions for the bootstrap tests to have power against local alternatives. We further develop a wild bootstrap Anderson–Rubin test for the full-vector inference and show that it controls size asymptotically even under weak identification in all clusters. We illustrate their good performance using simulations and provide an empirical application to a well-known dataset about US local labor markets.</p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"241 1","pages":"Article 105727"},"PeriodicalIF":6.3,"publicationDate":"2024-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140321564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-15. DOI: 10.1016/j.jeconom.2024.105709
Lu Yu, Jiaying Gu, Stanislav Volgushev
Consider a panel data setting where repeated observations on individuals are available. Often it is reasonable to assume that there exist groups of individuals that share similar effects of observed characteristics, but the grouping is typically unknown in advance. We first conduct a local analysis which reveals that the variances of the individual coefficient estimates contain useful information for the estimation of the group structure. We then propose a method to estimate unobserved groupings for general panel data models that explicitly accounts for this variance information. Our proposed method remains computationally feasible with a large number of individuals and/or repeated measurements on each individual. The developed ideas can also be applied even when individual-level data are not available and the researcher is given only parameter estimates together with some quantification of estimation uncertainty. A thorough simulation study demonstrates the superior performance of our method over existing methods, and we apply the method in two empirical applications.
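A toy version of the idea can be sketched as a precision-weighted one-dimensional k-means on the individual coefficient estimates. This is an illustrative stand-in for the paper's spectral-clustering procedure, not the procedure itself; the weighting by inverse estimation variance is the part the abstract emphasizes:

```python
import numpy as np

def precision_weighted_groups(theta, var, k=2, iters=50):
    """Cluster scalar coefficient estimates theta into k groups, updating each
    group center as a precision-weighted (1/var) average of its members.
    Deterministic quantile-based initialization of the centers."""
    theta = np.asarray(theta, float)
    w = 1.0 / np.asarray(var, float)  # precision of each unit's estimate
    centers = np.quantile(theta, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(theta[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = np.average(theta[mask], weights=w[mask])
    return labels, centers
```

Units with noisy estimates pull the group centers less, so the recovered grouping is driven by the precisely estimated coefficients.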
{"title":"Spectral clustering with variance information for group structure estimation in panel data","authors":"Lu Yu , Jiaying Gu , Stanislav Volgushev","doi":"10.1016/j.jeconom.2024.105709","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105709","url":null,"abstract":"<div><p>Consider a panel data setting where repeated observations on individuals are available. Often it is reasonable to assume that there exist groups of individuals that share similar effects of observed characteristics, but the grouping is typically unknown in advance. We first conduct a local analysis which reveals that the variances of the individual coefficient estimates contain useful information for the estimation of group structure. We then propose a method to estimate unobserved groupings for general panel data models that explicitly accounts for the variance information. Our proposed method remains computationally feasible with a large number of individuals and/or repeated measurements on each individual. The developed ideas can also be applied even when individual-level data are not available and only parameter estimates together with some quantification of estimation uncertainty are given to the researcher. A thorough simulation study demonstrates superior performance of our method than existing methods and we apply the method to two empirical applications.</p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"241 1","pages":"Article 105709"},"PeriodicalIF":6.3,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140138578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Testing normality against discrete normal mixtures is complex because some parameters turn increasingly underidentified along alternative ways of approaching the null, others are inequality constrained, and several higher-order derivatives become identically 0. These problems make the maximum of the alternative model log-likelihood function numerically unreliable. We propose score-type tests asymptotically equivalent to the likelihood ratio as the largest of two simple intuitive statistics that only require estimation under the null. One novelty of our approach is that we treat symmetrically both ways of writing the null hypothesis without excluding any region of the parameter space. We derive the asymptotic distribution of our tests under the null and sequences of local alternatives. We also show that their asymptotic distribution is the same whether applied to observations or standardized residuals from heteroskedastic regression models. Finally, we study their power in simulations and apply them to the residuals of Mincer earnings functions.
{"title":"Score-type tests for normal mixtures","authors":"Dante Amengual, Xinyue Bei, Marine Carrasco, Enrique Sentana","doi":"10.1016/j.jeconom.2024.105717","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105717","url":null,"abstract":"Testing normality against discrete normal mixtures is complex because some parameters turn increasingly underidentified along alternative ways of approaching the null, others are inequality constrained, and several higher-order derivatives become identically 0. These problems make the maximum of the alternative model log-likelihood function numerically unreliable. We propose score-type tests asymptotically equivalent to the likelihood ratio as the largest of two simple intuitive statistics that only require estimation under the null. One novelty of our approach is that we treat symmetrically both ways of writing the null hypothesis without excluding any region of the parameter space. We derive the asymptotic distribution of our tests under the null and sequences of local alternatives. We also show that their asymptotic distribution is the same whether applied to observations or standardized residuals from heteroskedastic regression models. Finally, we study their power in simulations and apply them to the residuals of Mincer earnings functions.","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"148 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140154183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-11. DOI: 10.1016/j.jeconom.2024.105723
Prosper Dovonon, Yves F. Atchadé, Firmin Doko Tchatoka
Moment condition models with mixed identification strength are models that are point identified but whose estimating moment functions are allowed to drift to 0 uniformly over the parameter space. Even though identification fails in the limit, depending on how slowly the moment functions vanish, consistent estimation is possible. Existing estimators such as the generalized method of moments (GMM) estimator exhibit nonstandard and even heterogeneous rates of convergence, with some parameter directions estimated at a slower rate than others. This paper derives asymptotic semiparametric efficiency bounds for regular estimators of the parameters of these models. We show that GMM estimators are regular and that the so-called two-step GMM estimator – using the inverse of the estimating function's variance as the weighting matrix – is semiparametrically efficient, as it reaches the minimum variance attainable by regular estimators. This estimator is also asymptotically minimax efficient with respect to a large family of loss functions. Monte Carlo simulations confirm these results.
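The two-step GMM estimator referenced here (identity weighting in step one, then the inverse of the estimated moment variance as the weighting matrix) can be sketched by grid search for a scalar parameter. This is an illustrative toy in a standard strongly identified setting, not the paper's drifting-moments framework:

```python
import numpy as np

def two_step_gmm(g, theta_grid):
    """Two-step GMM by grid search. g(theta) returns the n-by-m matrix of
    moment contributions. Step one minimizes the GMM objective with the
    identity weight; step two reweights by the inverse variance of the
    moments evaluated at the step-one solution."""
    def obj(theta, W):
        gbar = g(theta).mean(axis=0)
        return float(gbar @ W @ gbar)
    m = g(theta_grid[0]).shape[1]
    th1 = min(theta_grid, key=lambda th: obj(th, np.eye(m)))
    W2 = np.linalg.inv(np.atleast_2d(np.cov(g(th1), rowvar=False)))
    return min(theta_grid, key=lambda th: obj(th, W2))
```

With overidentifying moments of unequal noisiness, the second-step weighting downweights the noisier moments, which is what delivers the efficiency gain.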
{"title":"Efficiency bounds for moment condition models with mixed identification strength","authors":"Prosper Dovonon, Yves F. Atchadé, Firmin Doko Tchatoka","doi":"10.1016/j.jeconom.2024.105723","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105723","url":null,"abstract":"Moment condition models with mixed identification strength are models that are point identified but with estimating moment functions that are allowed to drift to 0 uniformly over the parameter space. Even though identification fails in the limit, depending on how slow the moment functions vanish, consistent estimation is possible. Existing estimators such as the generalized method of moment (GMM) estimator exhibit a pattern of nonstandard or even heterogeneous rate of convergence that materializes by some parameter directions being estimated at a slower rate than others. This paper derives asymptotic semiparametric efficiency bounds for regular estimators of parameters of these models. We show that GMM estimators are regular and that the so-called two-step GMM estimator – using the inverse of estimating function’s variance as weighting matrix – is semiparametrically efficient as it reaches the minimum variance attainable by regular estimators. This estimator is also asymptotically minimax efficient with respect to a large family of loss functions. Monte Carlo simulations are provided that confirm these results.","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"59 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140154621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-11. DOI: 10.1016/j.jeconom.2024.105716
Valentina Corradi, Jack Fosten, Daniel Gutknecht
This paper provides novel tests for comparing the out-of-sample predictive ability of two or more competing models that are possibly overlapping. The tests do not require pre-testing; they allow for dynamic misspecification and are valid under different estimation schemes and loss functions. In pairwise model comparisons, the test is constructed by adding a random perturbation to both the numerator and the denominator of a standard Diebold–Mariano test statistic. This prevents degeneracy in the presence of overlapping models but becomes asymptotically negligible otherwise. The test is shown to control the Type I error probability asymptotically at the nominal level, uniformly over all null data generating processes. A similar idea is used to develop a superior predictive ability test for the comparison of multiple models against a benchmark. Monte Carlo simulations demonstrate that our tests exhibit very good size control in finite samples, reducing both over- and under-rejection relative to competing tests. Finally, an application to forecasting U.S. excess bond returns provides evidence in favour of models using macroeconomic factors.
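The perturbation idea can be seen in miniature: when two models overlap, the loss differentials are identically zero and the usual Diebold–Mariano statistic is 0/0; a vanishing random term added to numerator and denominator keeps the statistic well defined. The scaling below is our illustration, not the paper's exact construction:

```python
import numpy as np

def perturbed_dm_stat(loss1, loss2, eps=1.0, seed=0):
    """Diebold-Mariano-type statistic with a random perturbation added to both
    numerator and denominator. If the loss differentials are identically zero
    (overlapping models), the statistic reduces to the standard normal draw xi
    instead of 0/0; for distinct models the O(n^{-1/2}) perturbation is
    asymptotically negligible."""
    rng = np.random.default_rng(seed)
    d = np.asarray(loss1, float) - np.asarray(loss2, float)
    n = len(d)
    xi = rng.standard_normal()
    num = np.sqrt(n) * d.mean() + eps * xi / np.sqrt(n)
    den = np.sqrt(d.var(ddof=1) + eps**2 / n)
    return num / den
```

With identical loss series the numerator and denominator both equal eps-scaled terms and the ratio is exactly xi, so the statistic stays standard normal under overlap rather than degenerating.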
{"title":"Predictive ability tests with possibly overlapping models","authors":"Valentina Corradi , Jack Fosten , Daniel Gutknecht","doi":"10.1016/j.jeconom.2024.105716","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105716","url":null,"abstract":"<div><p>This paper provides novel tests for comparing out-of-sample predictive ability of two or more competing models that are possibly overlapping. The tests do not require pre-testing, they allow for dynamic misspecification and are valid under different estimation schemes and loss functions. In pairwise model comparisons, the test is constructed by adding a random perturbation to both the numerator and denominator of a standard Diebold–Mariano test statistic. This prevents degeneracy in the presence of overlapping models but becomes asymptotically negligible otherwise. The test is shown to control the Type I error probability asymptotically at the nominal level, uniformly over all null data generating processes. A similar idea is used to develop a superior predictive ability test for the comparison of multiple models against a benchmark. Monte Carlo simulations demonstrate that our tests exhibit very good size control in finite samples reducing both over- and under-rejection relative to its competitors. Finally, an application to forecasting U.S. 
excess bond returns provides evidence in favour of models using macroeconomic factors.</p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"241 1","pages":"Article 105716"},"PeriodicalIF":6.3,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140096299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-11. DOI: 10.1016/j.jeconom.2024.105724
Anqi Zhao, Peng Ding
Randomized experiments balance all covariates on average and are considered the gold standard for estimating treatment effects. Chance imbalances are nonetheless common in realized treatment allocations. To inform readers of the comparability of treatment groups at baseline, contemporary scientific publications often report covariate balance tables with not only covariate means by treatment group but also the associated p-values from significance tests of their differences. The practical need to avoid small p-values as indicators of poor balance motivates balance check and rerandomization based on these p-values from covariate balance tests (ReP) as an attractive tool for improving covariate balance in designing randomized experiments. Despite the intuitiveness of such a strategy and its possibly already widespread use in practice, the literature lacks results about its implications for subsequent inference, subjecting many effectively rerandomized experiments to possibly inefficient analyses. To fill this gap, we examine a variety of potentially useful schemes for ReP and quantify their impact on subsequent inference. Specifically, we focus on three estimators of the average treatment effect from the unadjusted, additive, and interacted linear regressions of the outcome on treatment, respectively, and derive their asymptotic sampling properties under ReP. The main findings are threefold. First, the estimator from the interacted regression is asymptotically the most efficient under all ReP schemes examined, and permits convenient regression-assisted inference identical to that under complete randomization. Second, ReP, in contrast to complete randomization, improves the asymptotic efficiency of the estimators from the unadjusted and additive regressions. Standard regression analyses are accordingly still valid but in general overconservative. Third, ReP reduces the asymptotic conditional biases of the three estimators and improves their coherence in terms of mean squared difference. These results establish ReP as a convenient tool for improving covariate balance in designing randomized experiments, and we recommend using the interacted regression for analyzing data from ReP designs.
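The ReP design itself is easy to state in code: redraw the treatment allocation until every covariate's balance-test p-value clears a threshold. The sketch below uses a normal-approximation two-sample test and an arbitrary acceptance threshold of 0.3; both choices are ours, for illustration:

```python
import numpy as np
from math import erf, sqrt

def balance_pvals(X, z):
    """Two-sided normal-approximation p-values for the difference in covariate
    means between treatment (z=1) and control (z=0), one per column of X."""
    x1, x0 = X[z == 1], X[z == 0]
    se = np.sqrt(x1.var(axis=0, ddof=1) / len(x1) + x0.var(axis=0, ddof=1) / len(x0))
    t = (x1.mean(axis=0) - x0.mean(axis=0)) / se
    return np.array([2 * (1 - 0.5 * (1 + erf(abs(ti) / sqrt(2)))) for ti in t])

def rerandomize(X, n_treat, alpha=0.3, max_draws=10000, seed=0):
    """ReP-style design: redraw the allocation until every covariate balance
    p-value exceeds alpha."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    for _ in range(max_draws):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treat, replace=False)] = 1
        if balance_pvals(X, z).min() > alpha:
            return z
    raise RuntimeError("no acceptable allocation found")
```

Accepted allocations are by construction free of small balance-table p-values, which is exactly the conditioning whose inferential consequences the paper works out.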
{"title":"No star is good news: A unified look at rerandomization based on p-values from covariate balance tests","authors":"Anqi Zhao , Peng Ding","doi":"10.1016/j.jeconom.2024.105724","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105724","url":null,"abstract":"<div><p>Randomized experiments balance all covariates on average and are considered the gold standard for estimating treatment effects. Chance imbalances are nonetheless common in realized treatment allocations. To inform readers of the comparability of treatment groups at baseline, contemporary scientific publications often report covariate balance tables with not only covariate means by treatment group but also the associated <span><math><mi>p</mi></math></span>-values from significance tests of their differences. The practical need to avoid small <span><math><mi>p</mi></math></span>-values as indicators of poor balance motivates balance check and rerandomization based on these <span><math><mi>p</mi></math></span>-values from covariate balance tests (ReP) as an attractive tool for improving covariate balance in designing randomized experiments. Despite the intuitiveness of such strategy and its possibly already widespread use in practice, the literature lacks results about its implications on subsequent inference, subjecting many effectively rerandomized experiments to possibly inefficient analyses. To fill this gap, we examine a variety of potentially useful schemes for ReP and quantify their impact on subsequent inference. Specifically, we focus on three estimators of the average treatment effect from the unadjusted, additive, and interacted linear regressions of the outcome on treatment, respectively, and derive their asymptotic sampling properties under ReP. The main findings are threefold. 
First, the estimator from the interacted regression is asymptotically the most efficient under all ReP schemes examined, and permits convenient regression-assisted inference identical to that under complete randomization. Second, ReP, in contrast to complete randomization, improves the asymptotic efficiency of the estimators from the unadjusted and additive regressions. Standard regression analyses are accordingly still valid but in general overconservative. Third, ReP reduces the asymptotic conditional biases of the three estimators and improves their coherence in terms of mean squared difference. These results establish ReP as a convenient tool for improving covariate balance in designing randomized experiments, and we recommend using the interacted regression for analyzing data from ReP designs.</p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"241 1","pages":"Article 105724"},"PeriodicalIF":6.3,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140096300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-07. DOI: 10.1016/j.jeconom.2024.105725
Drew Creal, Jaeho Kim
We develop a flexible Bayesian model for cluster covariance matrices in large dimensions where the number of clusters and the assignment of cross-sectional units to clusters are a priori unknown and estimated from the data. In a cluster covariance matrix, the variances and covariances are equal within each diagonal block, while the covariances are equal in each off-diagonal block. This reduces the number of parameters by pooling together the parameters that are in the same cluster. In order to treat the number of clusters and the cluster assignments as unknowns, we build a random partition model which assigns a prior distribution over the space of partitions of the data into clusters. Sampling from the posterior over the space of partitions creates a flexible estimator because it averages across a wide set of cluster covariance matrices. We illustrate our methods on linear factor models and large vector autoregressions.
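The block structure described here is concrete: within each diagonal block, one variance on the diagonal and one covariance off it; one covariance per off-diagonal block. A small constructor (names and input layout are ours) makes the parameter pooling explicit:

```python
import numpy as np

def cluster_cov(assign, within_var, within_cov, between_cov):
    """Build a cluster covariance matrix from a partition. assign maps each
    unit to a cluster; within_var and within_cov are dicts keyed by cluster;
    between_cov is keyed by unordered cluster pairs. All entries in the same
    (diagonal or off-diagonal) block share one parameter."""
    n = len(assign)
    S = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            gi, gj = assign[i], assign[j]
            if i == j:
                S[i, j] = within_var[gi]          # diagonal of a diagonal block
            elif gi == gj:
                S[i, j] = within_cov[gi]          # off-diagonal, same cluster
            else:
                S[i, j] = between_cov[frozenset((gi, gj))]  # cross-cluster block
    return S
```

With G clusters this uses 2G + G(G-1)/2 parameters instead of n(n+1)/2, which is the dimension reduction the pooling buys.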
{"title":"Bayesian estimation of cluster covariance matrices of unknown form","authors":"Drew Creal , Jaeho Kim","doi":"10.1016/j.jeconom.2024.105725","DOIUrl":"https://doi.org/10.1016/j.jeconom.2024.105725","url":null,"abstract":"<div><p>We develop a flexible Bayesian model for cluster covariance matrices in large dimensions where the number of clusters and the assignment of cross-sectional units to a cluster are a-priori unknown and estimated from the data. In a cluster covariance matrix, the variances and covariances are equal within each diagonal block, while the covariances are equal in each off-diagonal block. This reduces the number of parameters by pooling those parameters together that are in the same cluster. In order to treat the number of clusters and the cluster assignments as unknowns, we build a random partition model which assigns a prior distribution over the space of partitions of the data into clusters. Sampling from the posterior over the space of partitions creates a flexible estimator because it averages across a wide set of cluster covariance matrices. We illustrate our methods on linear factor models and large vector autoregressions.</p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"241 1","pages":"Article 105725"},"PeriodicalIF":6.3,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140052404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-03-01. DOI: 10.1016/j.jeconom.2022.12.012
Joshua Angrist, Michal Kolesár
We revisit the finite-sample behavior of single-variable just-identified instrumental variables (just-ID IV) estimators, arguing that in most microeconometric applications, the usual inference strategies are likely reliable. Three widely-cited applications are used to explain why this is so. We then consider pretesting strategies of the form t_1 > c, where t_1 is the first-stage t-statistic, and the first-stage sign is given. Although pervasive in empirical practice, pretesting on the first-stage F-statistic exacerbates bias and distorts inference. We show, however, that median bias is both minimized and roughly halved by setting c = 0, that is, by screening on the sign of the estimated first stage. This bias reduction is a free lunch: conventional confidence interval coverage is unchanged by screening on the estimated first-stage sign. To the extent that IV analysts sign-screen already, these results strengthen the case for a sanguine view of the finite-sample behavior of just-ID IV.
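The c = 0 screen is simple to implement: compute the first-stage slope and report the just-ID IV (Wald) estimate only when that slope has the theoretically expected sign. A minimal sketch, with the simulation design and function names ours:

```python
import numpy as np

def just_id_iv(y, d, z):
    """Just-identified IV (Wald) estimate: cov(z, y) / cov(z, d)."""
    zc = z - z.mean()
    return (zc @ (y - y.mean())) / (zc @ (d - d.mean()))

def sign_screened_iv(y, d, z, expected_sign=+1):
    """Report the IV estimate only when the estimated first-stage slope has
    the expected sign (the c = 0 screen discussed above); otherwise None."""
    zc = z - z.mean()
    first_stage = (zc @ (d - d.mean())) / (zc @ zc)
    if np.sign(first_stage) != expected_sign:
        return None
    return just_id_iv(y, d, z)
```

Unlike a screen on a first-stage F threshold, the sign screen discards no allocations where the instrument works in the expected direction, which is why it can reduce median bias without distorting coverage.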
{"title":"One instrument to rule them all: The bias and coverage of just-ID IV","authors":"Joshua Angrist , Michal Kolesár","doi":"10.1016/j.jeconom.2022.12.012","DOIUrl":"10.1016/j.jeconom.2022.12.012","url":null,"abstract":"<div><p><span>We revisit the finite-sample behavior of single-variable just-identified instrumental variables<span> (just-ID IV) estimators, arguing that in most microeconometric applications, the usual inference strategies are likely reliable. Three widely-cited applications are used to explain why this is so. We then consider pretesting strategies of the form </span></span><span><math><mrow><msub><mrow><mi>t</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>></mo><mi>c</mi></mrow></math></span>, where <span><math><msub><mrow><mi>t</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span> is the first-stage <span><math><mi>t</mi></math></span>-statistic, and the first-stage sign is given. Although pervasive in empirical practice, pretesting on the first-stage <span><math><mi>F</mi></math></span>-statistic exacerbates bias and distorts inference. We show, however, that median bias is both minimized and roughly halved by setting <span><math><mrow><mi>c</mi><mo>=</mo><mn>0</mn></mrow></math></span>, that is by screening on the sign of the <em>estimated</em><span> first stage. This bias reduction is a free lunch: conventional confidence interval coverage is unchanged by screening on the estimated first-stage sign. 
To the extent that IV analysts sign-screen already, these results strengthen the case for a sanguine view of the finite-sample behavior of just-ID IV.</span></p></div>","PeriodicalId":15629,"journal":{"name":"Journal of Econometrics","volume":"240 2","pages":"Article 105398"},"PeriodicalIF":6.3,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135753445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"经济学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}