Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2026.106180
Adam Baybutt, Manu Navjeevan
Plausible identification of conditional average treatment effects (CATEs) can rely on controlling for a large number of variables to account for confounding factors. In these high-dimensional settings, estimation of the CATE requires estimating first-stage models whose consistency relies on correctly specifying their parametric forms. While doubly-robust estimators of the CATE exist, inference procedures based on the second-stage CATE estimator are not doubly-robust. Using the popular augmented inverse propensity weighting signal, we propose an estimator for the CATE whose resulting Wald-type confidence intervals are doubly-robust. We assume a logistic model for the propensity score and a linear model for the outcome regression, and estimate the parameters of these models using an ℓ1 (Lasso) penalty to address the high-dimensional covariates. Inference based on this estimator remains valid even if either the logistic propensity score model or the linear outcome regression model is misspecified.
Title: Doubly-robust inference for conditional average treatment effects with high-dimensional controls. Journal of Econometrics, vol. 253, Article 106180.
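As a rough illustration of the augmented inverse propensity weighting (AIPW) signal mentioned in the abstract, the sketch below evaluates the textbook form of the signal for each unit. It assumes the first-stage fits (propensity scores and the two outcome regressions) are already available as arrays; the function and argument names are hypothetical, and the paper's ℓ1-penalized first stage and its second-stage CATE estimator are not reproduced here.

```python
import numpy as np

def aipw_signal(y, d, pi_hat, mu1_hat, mu0_hat):
    """Textbook AIPW signal for each unit (illustrative, not the paper's code).

    y       : observed outcomes
    d       : binary treatment indicators (0 or 1)
    pi_hat  : fitted propensity scores P(D=1 | X), strictly inside (0, 1)
    mu1_hat : fitted outcome regression E[Y | D=1, X]
    mu0_hat : fitted outcome regression E[Y | D=0, X]
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    pi_hat = np.asarray(pi_hat, dtype=float)
    mu1_hat = np.asarray(mu1_hat, dtype=float)
    mu0_hat = np.asarray(mu0_hat, dtype=float)
    # Regression-based contrast between treated and untreated predictions.
    regression_part = mu1_hat - mu0_hat
    # Inverse-propensity-weighted residuals correct the regression contrast.
    correction = d * (y - mu1_hat) / pi_hat - (1 - d) * (y - mu0_hat) / (1 - pi_hat)
    return regression_part + correction

# Two units, one treated and one untreated, with made-up fitted values.
signal = aipw_signal(y=[3, 1], d=[1, 0], pi_hat=[0.5, 0.5], mu1_hat=[2, 2], mu0_hat=[1, 1])
```

The signal has the correct conditional mean if either the propensity score or the outcome regression is correct, which is the usual sense in which AIPW-based estimators are called doubly robust.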
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2026.106189
Jizhou Liu
This paper studies inference in two-stage randomized experiments under covariate-adaptive randomization. In the initial stage of this experimental design, clusters (e.g., households, schools, or graph partitions) are stratified and randomly assigned to control or treatment groups based on cluster-level covariates. Subsequently, an independent second-stage design is carried out, wherein units within each treated cluster are further stratified and randomly assigned to either control or treatment groups, based on individual-level covariates. Under the homogeneous partial interference assumption, I establish conditions under which the proposed difference-in-“average of averages” estimators are consistent and asymptotically normal for the corresponding average primary and spillover effects and develop consistent estimators of their asymptotic variances. Combining these results establishes the asymptotic validity of tests based on these estimators. My findings suggest that ignoring covariate information in the design stage can result in efficiency loss, and commonly used inference methods that ignore or improperly use covariate information can lead to either conservative or invalid inference. Then, I apply these results to studying optimal use of covariate information under covariate-adaptive randomization in large samples, and demonstrate that a specific generalized matched-pair design achieves minimum asymptotic variance for each proposed estimator. Finally, I discuss covariate adjustment, which incorporates additional baseline covariates not used for treatment assignment. The practical relevance of the theoretical results is illustrated through a simulation study and an empirical application.
Title: Inference for two-stage experiments under covariate-adaptive randomization. Journal of Econometrics, vol. 253, Article 106189.
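One possible reading of the difference-in-"average of averages" idea is sketched below: outcomes are first averaged within each cluster, the cluster means are then averaged with equal weight, and the result is compared with the same quantity for control clusters. The data layout and function names are assumptions made for illustration; the paper's exact estimators, including any stratum weighting, may differ.

```python
from statistics import mean

def average_of_averages(clusters):
    """Mean of within-cluster means, so every cluster gets equal weight."""
    return mean([mean(outcomes) for outcomes in clusters])

def primary_and_spillover(treated_clusters, control_clusters):
    """Illustrative primary and spillover contrasts.

    treated_clusters : list of (treated_unit_outcomes, untreated_unit_outcomes)
                       pairs, one pair per treated cluster
    control_clusters : list of outcome lists, one per control cluster
    """
    baseline = average_of_averages(control_clusters)
    # Treated units in treated clusters versus units in control clusters.
    primary = average_of_averages([t for t, _ in treated_clusters]) - baseline
    # Untreated units in treated clusters versus units in control clusters.
    spillover = average_of_averages([u for _, u in treated_clusters]) - baseline
    return primary, spillover

# Made-up outcomes: two treated clusters and two control clusters.
treated = [([5, 7], [3, 3]), ([9], [4, 2])]
control = [[1, 3], [2]]
primary, spillover = primary_and_spillover(treated, control)
```

Averaging cluster means rather than pooling all units keeps large clusters from dominating the estimate, which matters when cluster sizes vary.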
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2025.106174
Konrad Menzel
We derive asymptotic approximations for models of strategic network formation, where limits are taken as the number of nodes (agents) increases to infinity. Our framework assumes a random utility model where agents have heterogeneous tastes over links, and payoffs allow for anonymous and non-anonymous interaction effects, and the observed network is assumed to be pairwise stable. Our main results concern convergence of the link intensity from finite pairwise stable networks to the (many-player) limiting distribution. The set of possible limiting distributions is shown to have a fairly simple form and is characterized through aggregate equilibrium conditions, which may permit multiple solutions. We illustrate how these formal results can be used to analyze identification of link preferences and estimate or bound preference parameters. We also derive an analytical expression for agents’ welfare (expected surplus) from the structure of the network.
Title: Strategic network formation with many agents. Journal of Econometrics, vol. 253, Article 106174.
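To make the pairwise stability requirement concrete, the sketch below checks the standard definition on a small network: no agent gains by dropping one of their links, and no unlinked pair exists in which one agent strictly gains from linking while the other does not lose. The `utility` callback and the set-of-frozensets representation are hypothetical choices for illustration; the paper's payoff structure with interaction effects is richer than anything shown here.

```python
from itertools import combinations

def is_pairwise_stable(n_agents, links, utility):
    """Check pairwise stability of an undirected network.

    links   : set of frozenset({i, j}) pairs
    utility : function (agent, links) -> float, supplied by the caller
    """
    for i, j in combinations(range(n_agents), 2):
        pair = frozenset((i, j))
        if pair in links:
            without = links - {pair}
            # Either endpoint may sever an existing link unilaterally.
            if utility(i, without) > utility(i, links):
                return False
            if utility(j, without) > utility(j, links):
                return False
        else:
            with_link = links | {pair}
            gain_i = utility(i, with_link) - utility(i, links)
            gain_j = utility(j, with_link) - utility(j, links)
            # A missing link is blocking if one side strictly gains
            # and the other side is not made worse off.
            if (gain_i > 0 and gain_j >= 0) or (gain_j > 0 and gain_i >= 0):
                return False
    return True
```

With additively separable link payoffs, for example, a network is stable under this check exactly when the links present are those that neither endpoint wants to drop and no absent link is wanted by one side without hurting the other.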
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2026.106183
Xun Lu, Liangjun Su, Yinglong Ba
The widely-used common correlated effects (CCE) estimator, pioneered by Pesaran (2006), is computed using least squares applied to auxiliary regressions where the observed regressors are augmented with cross-sectional averages of the dependent variable and regressors. However, the CCE estimator requires a crucial rank condition and becomes inconsistent when this condition is violated and the factor loadings of the x- and y-equations are correlated, causing an endogeneity issue. This paper proposes a generalized CCE (GCCE) estimator by augmenting the regression with both cross-sectional and time-series averages of the regressors. We argue that the time-series averages can serve as “control variables” to address the endogeneity issue. We show that the GCCE and CCE estimators are asymptotically equivalent when the rank condition holds, and the GCCE estimator remains consistent even when the rank condition is violated under our “control variable” condition. Therefore, our GCCE estimator is doubly robust, achieving consistency under either the rank condition or the “control variable” condition. Furthermore, we propose a leave-one-out jackknife method to conduct valid inferences regardless of whether the rank condition holds. Monte Carlo simulations demonstrate excellent performance of our estimators and inference methods in finite samples. We apply our new methods to two datasets to estimate the production function and gravity equation.
Title: On generalized CCE estimation. Journal of Econometrics, vol. 253, Article 106183.
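For orientation, here is a minimal sketch of the standard pooled CCE estimator that the abstract takes as its starting point: each unit's data are projected off an intercept and the cross-sectional averages of the dependent variable and regressors, and a pooled least-squares slope is computed from the residualized data. The function name and array layout are assumptions; the GCCE augmentation with time-series averages and the leave-one-out jackknife are not reproduced, because the abstract does not spell out their construction.

```python
import numpy as np

def cce_pooled(y, x):
    """Pooled CCE slope estimate (standard form, for illustration only).

    y : array of shape (N, T), dependent variable
    x : array of shape (N, T, K), regressors
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n_units, n_periods, n_regressors = x.shape
    # Intercept plus cross-sectional averages of y and x at each date.
    h = np.column_stack([np.ones(n_periods), y.mean(axis=0), x.mean(axis=0)])
    # Residual-maker matrix; pinv tolerates collinear columns in h.
    m = np.eye(n_periods) - h @ np.linalg.pinv(h)
    xmx = np.zeros((n_regressors, n_regressors))
    xmy = np.zeros(n_regressors)
    for i in range(n_units):
        xmx += x[i].T @ m @ x[i]
        xmy += x[i].T @ m @ y[i]
    return np.linalg.solve(xmx, xmy)
```

The cross-sectional averages act as observable proxies for the unobserved common factors, which is why the rank condition on the factor loadings matters for consistency.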
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2025.106179
Xinbing Kong, Tong Zhang
This article introduces a nonlinear generalized matrix factor model, moving beyond the linear-Gaussian framework to accommodate a broader class of response models typically handled via logit, probit, Poisson, or Tobit structures. We introduce a novel Lagrange multiplier method carefully tailored to ensure that the penalized likelihood function is locally concave around the true factor and loading parameters. This leads to central limit theorems for the estimated factors and loadings, a nontrivial result for nonlinear matrix factor modeling. We establish the convergence rates of the estimated factor and loading matrices for the generalized matrix factor model under general conditions that allow for correlations across samples, rows, and columns. We provide a model selection criterion to determine the numbers of row and column factors. Extensive simulation studies demonstrate the superiority of the generalized matrix factor model in handling discrete and mixed-type variables. An empirical analysis of company operating performance shows that the generalized matrix factor model performs clustering and reconstruction well in the presence of discontinuous entries in the data matrix.
Title: Estimation and inference for large-dimensional generalized matrix factor models. Journal of Econometrics, vol. 253, Article 106179.
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2026.106185
Wayne Yuan Gao, Rui Wang
This paper provides a general identification approach for a wide range of nonlinear panel data models, including binary choice, ordered response, and other types of limited dependent variable models. Our approach accommodates dynamic models with any number of lagged dependent variables as well as other types of endogenous covariates. Our identification strategy relies on a partial stationarity condition, which allows for not only an unknown distribution of errors, but also temporal dependencies in errors. We derive partial identification results under flexible model specifications and establish sharpness of our identified set in the binary choice setting. We demonstrate the robust finite-sample performance of our approach using Monte Carlo simulations, and apply the approach to the empirical analysis of income categories using various ordered choice models.
Title: Identification in nonlinear dynamic panel models under partial stationarity. Journal of Econometrics, vol. 253, Article 106185.
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2025.106177
Shuping Shi, Peter C.B. Phillips
Asset prices are commonly represented as a drift-diffusion process, wherein the drift component denotes the anticipated return of the asset within some time frame, while the diffusion component accommodates random shocks. The drift component has substantial practical significance but accurate estimation is typically challenging and has met with limited success in the existing literature except over large time spans. This paper explores a comprehensive range of drift-diffusion models that include constant, linear, trending, and bursting drift. Conditions are identified under which realized squared drift (RSD) is a reliable tool for gauging integrated squared drift when the time span T_n is large enough. The recently introduced drift-robust quarticity estimator RiceQ is found to retain consistency under twin asymptotics with T_n → ∞ and infill Δ_n → 0, subject to some constraints on the divergence rate of T_n across different drift specifications. An inferential method of detecting nonzero drift using RSD and RiceQ is proposed and the drift tests are shown to be consistent under different data generating processes with various conditions on T_n. Simulation studies reveal excellent performance of the realized squared drift measure and the drift test in finite samples. The drift test is demonstrated empirically in real-time surveillance of market abnormalities in the Nasdaq Composite Index over two notable sample periods: the dotcom bubble (1996–2003) and the artificial intelligence boom (2016–2024), using intraday data.
Title: Uncovering mild drift in asset prices with intraday high-frequency data. Journal of Econometrics, vol. 253, Article 106177.
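The remark that drift is hard to estimate except over large time spans can be illustrated with the simplest constant-coefficient case, dX = mu dt + sigma dW. In the sketch below (function names are hypothetical), the naive drift estimate uses only the endpoints of the path, so its standard deviation is sigma / sqrt(span) no matter how finely the path is sampled. This is a textbook illustration only; it does not implement the realized squared drift or RiceQ statistics, whose definitions are not given in the abstract.

```python
import numpy as np

def simulate_log_price(mu, sigma, span, n_steps, rng):
    """Path of dX = mu dt + sigma dW on [0, span], started at 0."""
    dt = span / n_steps
    increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    return np.concatenate([[0.0], np.cumsum(increments)])

def naive_drift(path, span):
    """Average return per unit of time; only the two endpoints matter."""
    return (path[-1] - path[0]) / span

def naive_drift_std(sigma, span):
    """Standard deviation of naive_drift when mu and sigma are constant."""
    return sigma / np.sqrt(span)

# Sampling more finely (larger n_steps) leaves the estimator's precision
# unchanged; only a longer span reduces naive_drift_std.
rng = np.random.default_rng(0)
path = simulate_log_price(mu=0.05, sigma=0.2, span=4.0, n_steps=10_000, rng=rng)
estimate = naive_drift(path, span=4.0)
```

In contrast, volatility can be estimated increasingly well from finer sampling over a fixed span, which is why infill asymptotics alone help with the diffusion component but not with the drift.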
Pub Date: 2026-01-01 DOI: 10.1016/j.jeconom.2025.106166
Roy Allen
In a consideration set model, an individual maximizes utility among the considered alternatives. I relate an exogenous consideration set additive random utility model to classic discrete choice and the extended additive random utility model, in which utility can be −∞ for infeasible alternatives. When observable utility shifters are bounded, all three models are observationally equivalent. Moreover, they have the same counterfactual bounds and welfare formulas given variation in price-like utility indices. For attention interventions, welfare cannot change in the full consideration model but is completely unbounded in the limited consideration model. The identified set for consideration set probabilities has a minimal width for any bounded support of shifters, but with unbounded support it is a point: identification “towards” infinity does not resemble identification “at” infinity.
Title: Exogenous consideration and extended random utility. Journal of Econometrics, vol. 253, Article 106166.
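The link between limited consideration and utilities of −∞ can be seen at the level of a single decision maker: restricting the choice to a consideration set gives the same choice as maximizing over all alternatives after setting the utility of unconsidered alternatives to −∞. The sketch below shows only this mechanical step, with hypothetical function names; the paper's observational equivalence results concern distributions of choices and are not demonstrated here.

```python
import math

def choose(utilities, considered):
    """Index of the utility-maximizing alternative among those considered."""
    return max(considered, key=lambda j: utilities[j])

def extended_utilities(utilities, considered):
    """Recast a consideration set as utilities: unconsidered -> -infinity."""
    allowed = set(considered)
    return [u if j in allowed else -math.inf for j, u in enumerate(utilities)]

# Alternative 1 has the highest utility but is not considered.
utilities = [1.0, 3.0, 2.0]
considered = [0, 2]
limited_choice = choose(utilities, considered)
extended_choice = choose(extended_utilities(utilities, considered), range(len(utilities)))
```

Both calls select the same alternative, which is the sense in which an infeasible or unconsidered option behaves like one with utility −∞.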
Pub Date: 2025-12-26 DOI: 10.1016/j.jeconom.2025.106151
Daniel Ober-Reynolds
Missing data is pervasive in econometric applications, and rarely is it plausible that the data are missing (completely) at random. This paper proposes a methodology for studying the robustness of results drawn from incomplete datasets. Selection is measured as the divergence from the distribution of complete observations to the distribution of incomplete observations. The breakdown point is defined as the minimal amount of selection needed to overturn a given result. Reporting point estimates and lower confidence intervals of the breakdown point is a simple, concise way to communicate the robustness of a result. An estimator of the breakdown point is proposed and shown to be √n-consistent and asymptotically normal. This estimator can be applied directly to conclusions drawn from any model identified with the generalized method of moments (GMM) that satisfies mild assumptions. Simulations demonstrate the finite sample performance of the breakdown point estimator on averages, linear regression, and logistic regression. The methodology is illustrated by estimating the breakdown point of conclusions drawn from several randomized controlled trials suffering from missing data due to attrition.
Title: Robustness to missing data: breakdown point analysis. Journal of Econometrics, vol. 253, Article 106151.
Pub Date: 2025-12-25 DOI: 10.1016/j.jeconom.2025.106170
Chunrong Ai, Yue Fang, Haitian Xie
This paper studies policy learning for continuous treatments from observational data. Continuous treatments present more significant challenges than discrete ones because population welfare may need nonparametric estimation, and policy space may be infinite-dimensional and may satisfy shape restrictions. We propose to approximate the policy space with a sequence of finite-dimensional spaces and, for any given policy, obtain the empirical welfare by applying the kernel method. We consider two cases: known and unknown propensity scores. In the latter case, we allow for machine learning of the propensity score and modify the empirical welfare to account for the effect of machine learning. The learned policy maximizes the empirical welfare or the modified empirical welfare over the approximating space. In both cases, we modify the penalty algorithm proposed in Mbakop and Tabord-Meehan (2021) to data-automate the tuning parameters (i.e., bandwidth and dimension of the approximating space) and establish an oracle inequality for the welfare regret.
Title: Data-driven policy learning for continuous treatments. Journal of Econometrics, vol. 253, Article 106170.
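One common way to apply a kernel method to the welfare of a continuous-treatment policy, when the propensity score is known, is to weight each observed outcome by a kernel in the distance between the observed treatment and the treatment the policy would recommend, divided by the conditional density of the observed treatment. The sketch below assumes that form with a Gaussian kernel; the function name, the kernel choice, and the estimator itself are assumptions for illustration and may differ from the paper's empirical welfare, its machine-learning correction, and its penalty-based tuning.

```python
import numpy as np

def kernel_welfare(policy, x, t, y, density, bandwidth):
    """Kernel-smoothed inverse-propensity estimate of welfare under `policy`.

    policy    : function mapping covariates x to a recommended treatment level
    x, t, y   : covariates, observed continuous treatments, outcomes
    density   : known conditional density f(t_i | x_i) evaluated at the data
    bandwidth : kernel bandwidth h > 0
    """
    x, t, y, density = (np.asarray(a, dtype=float) for a in (x, t, y, density))
    u = (t - policy(x)) / bandwidth
    # Gaussian kernel scaled by the bandwidth, K(u) / h.
    weights = np.exp(-0.5 * u**2) / (np.sqrt(2 * np.pi) * bandwidth)
    return float(np.mean(weights * y / density))
```

A smaller bandwidth concentrates weight on units whose observed treatment is close to the recommended one, trading lower bias for higher variance, which is why the bandwidth is a tuning parameter worth choosing from the data.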