Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations
Susan Athey, Guido W. Imbens, Jonas Metzger, Evan Munro
Journal of Econometrics, volume 240, issue 2, Article 105076. Publication date: 2024-03-01. DOI: 10.1016/j.jeconom.2020.09.013
https://www.sciencedirect.com/science/article/pii/S0304407621000440
Citation count: 0
Abstract
When researchers develop new econometric methods, it is common practice to compare the performance of the new methods to that of existing methods in Monte Carlo studies. The credibility of such Monte Carlo studies is often limited because of the discretion the researcher has in choosing the Monte Carlo designs reported. To improve credibility, we propose using a class of generative models recently developed in the machine learning literature, termed Generative Adversarial Networks (GANs), which can be used to systematically generate artificial data that closely mimic existing datasets. Thus, in combination with existing real datasets, GANs can be used to limit the researcher's degrees of freedom in Monte Carlo study designs, making any comparisons more convincing. In addition, if an applied researcher is concerned with the performance of a particular statistical method on a specific dataset (beyond its theoretical properties in large samples), she can use such GANs to assess the performance of the proposed method, e.g., the coverage rate of confidence intervals or the bias of the estimator, using simulated data that closely resemble the exact setting of interest. To illustrate these methods, we apply Wasserstein GANs (WGANs) to the estimation of average treatment effects. In this example, we find that (i) there is not a single estimator that outperforms the others in all three settings, so researchers should tailor their analytic approach to a given setting; (ii) systematic simulation studies can be helpful for selecting among competing methods in this situation; and (iii) the generated data closely resemble the actual data.
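The assessment the abstract mentions (bias of an estimator and coverage rate of its confidence intervals on simulated data) can be sketched roughly as follows. This is only an illustration under stated assumptions: `draw_sample` is a hypothetical stand-in generator (a simple Gaussian model with a known treatment effect `TRUE_ATE`), not a trained WGAN, and the difference-in-means estimator with a normal-approximation 95% interval is chosen here for brevity rather than taken from the paper. In practice one would presumably replace `draw_sample` with draws from a generator fitted to the real data.

```python
import random
import statistics
from math import sqrt

TRUE_ATE = 2.0  # assumption: effect built into the stand-in generator, so it is known


def draw_sample(n, rng):
    """Hypothetical stand-in for a fitted generator: returns (treated, outcome) pairs."""
    sample = []
    for _ in range(n):
        treated = rng.random() < 0.5  # randomized treatment assignment
        outcome = rng.gauss(0.0, 1.0) + (TRUE_ATE if treated else 0.0)
        sample.append((treated, outcome))
    return sample


def diff_in_means(sample):
    """Difference-in-means estimate of the average treatment effect and its standard error."""
    y1 = [y for w, y in sample if w]
    y0 = [y for w, y in sample if not w]
    estimate = statistics.fmean(y1) - statistics.fmean(y0)
    std_err = sqrt(statistics.variance(y1) / len(y1) + statistics.variance(y0) / len(y0))
    return estimate, std_err


def evaluate(n_reps=500, n=200, seed=0):
    """Monte Carlo bias, RMSE, and 95% interval coverage over repeated simulated samples."""
    rng = random.Random(seed)
    estimates = []
    covered = 0
    for _ in range(n_reps):
        estimate, std_err = diff_in_means(draw_sample(n, rng))
        estimates.append(estimate)
        # The interval estimate +/- 1.96 * std_err covers the truth when this holds.
        if abs(estimate - TRUE_ATE) <= 1.96 * std_err:
            covered += 1
    bias = statistics.fmean(estimates) - TRUE_ATE
    rmse = sqrt(statistics.fmean([(e - TRUE_ATE) ** 2 for e in estimates]))
    return {"bias": bias, "rmse": rmse, "coverage": covered / n_reps}


if __name__ == "__main__":
    print(evaluate())
```

Because the generator's true effect is known by construction, bias and coverage can be computed exactly; running the same loop for several estimators would give the kind of side-by-side comparison the abstract describes.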
About the journal:
The Journal of Econometrics serves as an outlet for important, high-quality, new research in both theoretical and applied econometrics. The scope of the Journal includes papers dealing with identification, estimation, testing, decision, and prediction issues encountered in economic research. Classical Bayesian statistics, and machine learning methods, are decidedly within the range of the Journal's interests. The Annals of Econometrics is a supplement to the Journal of Econometrics.