Pub Date: 2024-05-01 | DOI: 10.1016/j.jeconom.2024.105794
Alessandro Casini , Pierre Perron
We introduce a nonparametric nonlinear VAR prewhitened long-run variance (LRV) estimator for the construction of standard errors robust to autocorrelation and heteroskedasticity that can be used for hypothesis testing in a variety of contexts, including the linear regression model. Existing methods either are theoretically valid only under stationarity and have poor finite-sample properties under nonstationarity (i.e., fixed-b methods), or are theoretically valid under the null hypothesis but lead to tests that are not consistent under nonstationary alternative hypotheses (i.e., both fixed-b and traditional HAC estimators). The proposed estimator accounts explicitly for nonstationarity, unlike previous prewhitened procedures, which are known to be unreliable, and leads to tests with accurate null rejection rates and good monotonic power. We also establish MSE bounds for LRV estimation that are sharper than those previously established and use them to determine the data-dependent bandwidths.
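As context for the prewhitening idea, the sketch below implements the classical linear AR(1) prewhiten-and-recolor recipe with a Bartlett kernel on a simulated AR(1) series. This is a simplified stand-in for intuition only, not the authors' nonlinear VAR procedure, and all parameter values (sample size, AR coefficient, bandwidth) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series as a stand-in for regression scores.
T, rho = 500, 0.7
v = np.zeros(T)
for t in range(1, T):
    v[t] = rho * v[t - 1] + rng.standard_normal()

# Step 1: prewhiten with a linear AR(1) fitted by OLS.
a_hat = (v[:-1] @ v[1:]) / (v[:-1] @ v[:-1])
e = v[1:] - a_hat * v[:-1]                 # (approximately) whitened residuals

# Step 2: Bartlett-kernel (Newey-West) LRV of the whitened residuals.
def bartlett_lrv(u, bandwidth):
    u = u - u.mean()
    n = len(u)
    lrv = u @ u / n
    for k in range(1, bandwidth + 1):
        lrv += 2.0 * (1.0 - k / (bandwidth + 1)) * (u[:-k] @ u[k:]) / n
    return lrv

# Step 3: recolor through the fitted AR polynomial at frequency zero.
omega = bartlett_lrv(e, bandwidth=4) / (1.0 - a_hat) ** 2

# For this DGP the true LRV with unit-variance shocks is 1 / (1 - rho)^2.
true_lrv = 1.0 / (1.0 - rho) ** 2
```

Prewhitening flattens the spectrum before kernel smoothing, so a small bandwidth suffices; the recoloring step divides by the squared AR polynomial evaluated at frequency zero.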
"Prewhitened long-run variance estimation robust to nonstationarity," Journal of Econometrics, 242(1), Article 105794.
Pub Date: 2024-05-01 | DOI: 10.1016/j.jeconom.2024.105785
Manudeep Bhuller , Henrik Sigstad
We study what two-stage least squares (2SLS) identifies in models with multiple treatments under treatment effect heterogeneity. Two conditions are shown to be necessary and sufficient for the 2SLS to identify positively weighted sums of agent-specific effects of each treatment: average conditional monotonicity and no cross effects. Our identification analysis allows for any number of treatments, any number of continuous or discrete instruments, and the inclusion of covariates. We provide testable implications and present characterizations of choice behavior implied by our identification conditions.
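To fix notation, here is a minimal 2SLS computation with two binary treatments and two binary instruments under a hypothetical constant-effects DGP, where 2SLS is consistent for the structural coefficients; the identification issues the paper studies arise precisely when treatment effects are heterogeneous. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Hypothetical DGP: two binary instruments, two endogenous binary treatments,
# constant effects (2, -1), endogeneity through the common shock u.
z1 = rng.integers(0, 2, n).astype(float)
z2 = rng.integers(0, 2, n).astype(float)
u = rng.standard_normal(n)
d1 = (0.5 * z1 + 0.2 * u + rng.standard_normal(n) > 0).astype(float)
d2 = (0.5 * z2 - 0.2 * u + rng.standard_normal(n) > 0).astype(float)
y = 1.0 + 2.0 * d1 - 1.0 * d2 + u + rng.standard_normal(n)

X = np.column_stack([np.ones(n), d1, d2])   # constant + endogenous treatments
Z = np.column_stack([np.ones(n), z1, z2])   # constant + instruments

# 2SLS: regress each column of X on Z, then regress y on the fitted values.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
```

With heterogeneous effects, the question the paper answers is when coefficients like `beta[1]` and `beta[2]` remain positively weighted sums of agent-specific effects.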
"2SLS with multiple treatments," Journal of Econometrics, 242(1), Article 105785. Open access: https://www.sciencedirect.com/science/article/pii/S0304407624001313/pdfft?md5=2731cd7dd32f64f1ed1eb1837a569f7d&pid=1-s2.0-S0304407624001313-main.pdf
Pub Date: 2024-05-01 | DOI: 10.1016/j.jeconom.2024.105787
Cem Çakmaklı , Yasin Şimşek
This paper extends the canonical model of epidemiology, the SIRD model, to allow for time-varying parameters for real-time measurement and prediction of the trajectory of the Covid-19 pandemic. Time variation in model parameters is captured using the score-driven modeling structure designed for the typical daily count data related to the pandemic. The resulting specification permits a flexible yet parsimonious model with a low computational cost. The model is extended to allow for unreported cases using a mixed-frequency setting. Results suggest that these cases’ effects on the parameter estimates might be sizeable. Full sample results show that the flexible framework accurately captures the successive waves of the pandemic. A real-time exercise indicates that the proposed structure delivers timely and precise information on the pandemic’s current stance. This superior performance, in turn, transforms into accurate predictions of the death cases and cases treated in Intensive Care Units (ICUs).
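For reference, the deterministic SIRD recursion that the paper augments with score-driven time-varying parameters can be sketched as follows. Here the time variation is replaced by a single illustrative break in the infection rate, and all parameter values are hypothetical rather than estimates from the paper.

```python
# Discrete-time SIRD recursion with a piecewise-constant infection rate.
# All parameter values are illustrative.
N = 1_000_000.0
gamma, nu = 0.10, 0.002                 # recovery and death rates
S, I, R, D = N - 100.0, 100.0, 0.0, 0.0
for t in range(300):
    beta_t = 0.30 if t < 60 else 0.12   # e.g. an intervention lowers transmission
    new_inf = beta_t * S * I / N        # flows out of each compartment,
    new_rec = gamma * I                 # computed from the current state
    new_dead = nu * I
    S -= new_inf
    I += new_inf - new_rec - new_dead
    R += new_rec
    D += new_dead
```

The population total S + I + R + D is conserved by construction; the paper's contribution is letting beta_t (and the other rates) evolve with the data via a score-driven law of motion instead of a fixed break.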
"Bridging the Covid-19 data and the epidemiological model using the time-varying parameter SIRD model," Journal of Econometrics, 242(1), Article 105787.
Pub Date: 2024-05-01 | DOI: 10.1016/j.jeconom.2024.105793
Yong Cai , Ahnaf Rafi
The Neyman Allocation is used in many papers on experimental design, which typically assume that researchers have access to large pilot studies. This may be unrealistic. To understand the properties of the Neyman Allocation with small pilots, we study its behavior in an asymptotic framework that takes pilot size to be fixed even as the size of the main wave tends to infinity. Our analysis shows that the Neyman Allocation can lead to estimates of the ATE with higher asymptotic variance than (non-adaptive) balanced randomization. In particular, this happens when the outcome variable is relatively homoskedastic with respect to treatment status or when it exhibits high kurtosis. We provide a series of empirical examples showing that such situations can arise in practice. Our results suggest that researchers with small pilots should not use the Neyman Allocation if they believe that outcomes are homoskedastic or heavy-tailed. Finally, we examine via simulations some potential methods for improving the finite-sample performance of the feasible Neyman Allocation (FNA).
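The allocation rule itself is simple to state: treat a share of the main wave proportional to the pilot standard deviation of each arm. A minimal sketch, where the pilot sizes and outcome distributions are illustrative:

```python
import numpy as np

def neyman_share(pilot_treated, pilot_control):
    """Share of the main wave to treat: s1 / (s1 + s0) from pilot std devs."""
    s1 = np.std(pilot_treated, ddof=1)
    s0 = np.std(pilot_control, ddof=1)
    return s1 / (s1 + s0)

rng = np.random.default_rng(2)
# A small pilot of 10 + 10 units; the treated arm is truly more dispersed,
# but with so few observations the estimated share is noisy.
pilot_t = rng.normal(1.0, 2.0, size=10)
pilot_c = rng.normal(0.0, 1.0, size=10)
share = neyman_share(pilot_t, pilot_c)
```

The paper's point is that with pilots this small, the noise in `share` can make the resulting ATE estimator less precise than simply treating half the sample.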
"On the performance of the Neyman Allocation with small pilots," Journal of Econometrics, 242(1), Article 105793.
Pub Date: 2024-05-01 | DOI: 10.1016/j.jeconom.2024.105784
Haitian Xie
Many empirical examples of regression discontinuity (RD) designs concern a continuous treatment variable, but the theoretical aspects of such models are less studied. This study examines the identification and estimation of the structural function in fuzzy RD designs with a continuous treatment variable. The structural function fully describes the causal impact of the treatment on the outcome. We show that the nonlinear and nonseparable structural function can be nonparametrically identified at the RD cutoff under shape restrictions, including monotonicity and smoothness conditions. Based on the nonparametric identification equation, we propose a three-step semiparametric estimation procedure and establish the asymptotic normality of the estimator. The semiparametric estimator achieves the same convergence rate as in the case of a binary treatment variable. As an application of the method, we estimate the causal effect of sleep time on health status by using the discontinuity in natural light timing at time zone boundaries.
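As a simplified illustration of fuzzy RD with a continuous treatment, the sketch below estimates a Wald-type ratio of local-linear jumps at the cutoff. The DGP uses a linear, constant treatment effect as a stand-in; it is not the nonlinear, nonseparable structural function the paper identifies, and the bandwidth is illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, h = 5000, 0.2                     # sample size and bandwidth (illustrative)
x = rng.uniform(-1.0, 1.0, n)        # running variable, cutoff at 0
d = 1.0 + 0.5 * (x >= 0)             # continuous treatment, jumps by 0.5
y = 2.0 * d + x + rng.normal(0.0, 0.5, n)   # constant treatment effect of 2

def jump_at_cutoff(v):
    """Difference of local-linear intercepts fitted on each side within h."""
    est = []
    for side in (x < 0, x >= 0):
        m = side & (np.abs(x) < h)
        A = np.column_stack([np.ones(m.sum()), x[m]])
        est.append(np.linalg.lstsq(A, v[m], rcond=None)[0][0])
    return est[1] - est[0]

# Wald-type fuzzy RD estimate: outcome jump over treatment jump.
effect = jump_at_cutoff(y) / jump_at_cutoff(d)
```

Under heterogeneity this ratio no longer summarizes the causal relationship; the paper's shape restrictions are what allow the full structural function to be recovered at the cutoff.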
"Nonlinear and nonseparable structural functions in regression discontinuity designs with a continuous treatment," Journal of Econometrics, 242(1), Article 105784.
Pub Date: 2024-04-01 | DOI: 10.1016/j.jeconom.2024.105735
Leonard Goff
When a researcher combines multiple instrumental variables for a single binary treatment, the monotonicity assumption of the local average treatment effects (LATE) framework can become restrictive: it requires that all units share a common direction of response even when separate instruments are shifted in opposing directions. What I call vector monotonicity, by contrast, simply assumes treatment uptake to be monotonic in all instruments. I characterize the class of causal parameters that are point identified under vector monotonicity, when the instruments are binary. This class includes, for example, the average treatment effect among units that are in any way responsive to the collection of instruments, or those that are responsive to a given subset of them. The identification results are constructive and yield a simple estimator for the identified treatment effect parameters. An empirical application revisits the labor market returns to college.
"A vector monotonicity assumption for multiple instruments," Journal of Econometrics, 241(1), Article 105735.
Pub Date: 2024-04-01 | DOI: 10.1016/j.jeconom.2024.105742
Jacob Schwartz , Kyungchul Song
In many empirical studies of a large two-sided matching market (such as in a college admissions problem), the researcher performs statistical inference under the assumption that they observe a random sample from a large matching market. In this paper, we consider a setting in which the researcher observes either all or a nontrivial fraction of outcomes from a stable matching. We establish a concentration inequality for empirical matching probabilities assuming strong correlation among the colleges’ preferences while allowing students’ preferences to be fully heterogeneous. Our concentration inequality yields laws of large numbers for the empirical matching probabilities and other statistics commonly used in empirical analyses of a large matching market. To illustrate the usefulness of our concentration inequality, we prove consistency for estimators of conditional matching probabilities and measures of positive assortative matching.
"The law of large numbers for large stable matchings," Journal of Econometrics, 241(1), Article 105742.
Pub Date: 2024-04-01 | DOI: 10.1016/j.jeconom.2024.105752
Giovanni Urga , Fa Wang
This paper proposes maximum (quasi-)likelihood estimation for high dimensional factor models with regime switching in the loadings. The model parameters are estimated jointly by the EM (expectation maximization) algorithm, which in the current context only requires iteratively calculating regime probabilities and principal components of the weighted sample covariance matrix. When regime dynamics are taken into account, smoothed regime probabilities are calculated using a recursive algorithm. Consistency, convergence rates and limit distributions of the estimated loadings and the estimated factors are established under weak cross-sectional and temporal dependence as well as heteroskedasticity. It is worth noting that, owing to the high dimensionality, regime switching can be identified consistently with only one observation period after the switching point. Simulation results show good performance of the proposed method. An application to the FRED-MD dataset illustrates the potential of the proposed method for detecting business cycle turning points.
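The loading update inside such an EM iteration reduces to principal components of a regime-weighted sample covariance. The sketch below performs that step with the regime indicator treated as known (in the paper it is replaced by estimated regime probabilities); the DGP, with one factor and a single loading break, is illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
T, N = 400, 100
f = rng.standard_normal(T)                  # single latent factor
s = np.arange(T) >= 200                     # regime indicator (known, for the sketch)
lam0 = rng.standard_normal(N)               # pre-switch loadings
lam1 = rng.standard_normal(N)               # post-switch loadings
e = 0.5 * rng.standard_normal((T, N))
X = np.where(s[:, None], np.outer(f, lam1), np.outer(f, lam0)) + e

def pc_loading(Xr):
    """Scaled leading principal component of the sample covariance."""
    cov = Xr.T @ Xr / Xr.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    return np.sqrt(vals[-1]) * vecs[:, -1]

# One M-step-style update: loadings regime by regime.
lhat0, lhat1 = pc_loading(X[~s]), pc_loading(X[s])
corr0 = abs(np.corrcoef(lhat0, lam0)[0, 1])   # abs() handles sign indeterminacy
corr1 = abs(np.corrcoef(lhat1, lam1)[0, 1])
```

Because N is large, the two loading vectors are well separated even with a single post-switch observation period, which is the intuition behind the identification result quoted above.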
"Estimation and inference for high dimensional factor model with regime switching," Journal of Econometrics, 241(2), Article 105752.
Pub Date: 2024-04-01 | DOI: 10.1016/j.jeconom.2024.105764
Anna Bykhovskaya , James A. Duffy
This paper considers highly persistent time series that are subject to nonlinearities in the form of censoring or an occasionally binding constraint, such as are regularly encountered in macroeconomics. A tractable candidate model for such series is the dynamic Tobit with a root local to unity. We show that this model generates a process that converges weakly to a non-standard limiting process that is constrained (regulated) to be positive. Surprisingly, despite the presence of censoring, the OLS estimators of the model parameters are consistent. We show that this allows OLS-based inferences to be drawn on the overall persistence of the process (as measured by the sum of the autoregressive coefficients), and the null of a unit root to be tested in the presence of censoring. Our simulations illustrate that the conventional ADF test substantially over-rejects when the data are generated by a dynamic Tobit with a unit root, whereas our proposed test is correctly sized. We provide an application of our methods to testing for a unit root in the Swiss franc/euro exchange rate during a period when it was subject to an occasionally binding lower bound.
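The consistency claim is easy to probe by simulation: generate a dynamic Tobit with a unit root and run OLS of y_t on a constant and y_{t-1}. A minimal sketch, with sample size and innovation distribution chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 50_000
eps = rng.standard_normal(T)
y = np.zeros(T)
# Dynamic Tobit with a unit root: y_t = max(0, y_{t-1} + eps_t).
for t in range(1, T):
    y[t] = max(0.0, y[t - 1] + eps[t])

# OLS of y_t on a constant and y_{t-1}, ignoring the censoring.
X = np.column_stack([np.ones(T - 1), y[:-1]])
alpha_hat, rho_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
```

Intuitively, the regulated process spends a vanishing fraction of time near the censoring boundary, so the autoregressive coefficient is estimated as if there were no censoring; the slope estimate lands close to one in large samples.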
"The local to unity dynamic Tobit model," Journal of Econometrics, 241(2), Article 105764.
Pub Date: 2024-04-01 | DOI: 10.1016/j.jeconom.2024.105740
Yuehao Bai , Liang Jiang , Joseph P. Romano , Azeem M. Shaikh , Yichong Zhang
This paper studies inference for the average treatment effect (ATE) in experiments in which treatment status is determined according to “matched pairs” and it is additionally desired to adjust for observed, baseline covariates to gain further precision. By a “matched pairs” design, we mean that units are sampled i.i.d. from the population of interest, paired according to observed, baseline covariates, and finally, within each pair, one unit is selected at random for treatment. Importantly, we presume that not all observed, baseline covariates are used in determining treatment assignment. We study a broad class of estimators based on a “doubly robust” moment condition that permits us to study estimators with both finite-dimensional and high-dimensional forms of covariate adjustment. We find that estimators with finite-dimensional, linear adjustments need not lead to improvements in precision relative to the unadjusted difference-in-means estimator. This phenomenon persists even if the adjustments interact with treatment; in fact, doing so leads to no changes in precision. However, gains in precision can be ensured by including fixed effects for each of the pairs. Indeed, we show that this adjustment leads to the minimum asymptotic variance of the corresponding ATE estimator among all finite-dimensional, linear adjustments. We additionally study an estimator with a regularized adjustment, which can accommodate high-dimensional covariates. We show that this estimator leads to improvements in precision relative to the unadjusted difference-in-means estimator and also provides conditions under which it leads to the “optimal” nonparametric, covariate adjustment. A simulation study confirms the practical relevance of our theoretical analysis, and the methods are employed to reanalyze data from an experiment using a “matched pairs” design to study the effect of macroinsurance on microenterprise.
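A small simulation illustrates the pair-differencing logic: within-pair differences absorb the covariate used for pairing, and linearly adjusting those differences for a baseline covariate not used in pairing tightens the estimate. The DGP below is hypothetical, not the paper's, and the constant-effect setup is only for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, tau = 5000, 1.0

# Hypothetical DGP: both units of a pair share the matching covariate x; each
# unit also has a baseline covariate w that was NOT used in forming pairs.
x = rng.standard_normal(n)
w = rng.standard_normal((n, 2))
eps = 0.5 * rng.standard_normal((n, 2))
y = 2.0 * x[:, None] + 1.5 * w + eps
treated = rng.integers(0, 2, n)          # which unit of each pair is treated
rows = np.arange(n)
y[rows, treated] += tau

# Within-pair differences (treated minus control) absorb the pair-level x.
dy = y[rows, treated] - y[rows, 1 - treated]
dw = w[rows, treated] - w[rows, 1 - treated]

ate_unadjusted = dy.mean()
# Linear adjustment for the unused covariate: intercept of dy on (1, dw).
A = np.column_stack([np.ones(n), dw])
ate_adjusted = np.linalg.lstsq(A, dy, rcond=None)[0][0]
```

Here the mean within-pair difference already equals the difference-in-means estimator, mirroring the paper's observation that pair fixed effects come for free in this design; the extra precision comes from adjusting for `w`.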
"Covariate adjustment in experiments with matched pairs," Journal of Econometrics, 241(1), Article 105740.