Pub Date: 2022-12-01; DOI: 10.1177/1536867X221140943
Takuya Hasebe
In this article, I describe the commands that implement the estimation of three endogenous models of binary choice outcomes. The command esbinary fits the endogenously switching model, where a potential outcome differs across two treatment states. The command edbinary fits the endogenous dummy model, which includes a dummy variable indicating the treatment state as one of the explanatory variables. After one estimates the parameters of these models, various treatment effects can be estimated as postestimation statistics. The command ssbinary fits the sample-selection model, where an outcome is observed in only one of the states. The commands fit these models using copula-based maximum-likelihood estimation.
Title: "Endogenous models of binary choice outcomes: Copula-based maximum-likelihood estimation and treatment effects" (Stata Journal)
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221140960
Guanpeng Yan, Qian Chen
The regression control method, also known as the panel-data approach for program evaluation (Hsiao, Ching, and Wan, 2012, Journal of Applied Econometrics 27: 705–740; Hsiao and Zhou, 2019, Journal of Applied Econometrics 34: 463–481), is a convenient method for causal inference in panel data that exploits cross-sectional correlation to construct counterfactual outcomes for a single treated unit by linear regression. In this article, we present the rcm command, which efficiently implements the regression control method with or without covariates. Available methods for model selection include best subset, lasso, and forward stepwise and backward stepwise regression, while available selection criteria include the corrected Akaike information criterion, the Akaike information criterion, the Bayesian information criterion, the modified Bayesian information criterion, and cross-validation. Estimation and counterfactual predictions can be made by ordinary least squares, lasso, or postlasso ordinary least squares. For statistical inference, both the in-space placebo test using fake treatment units and the in-time placebo test using a fake treatment time can be implemented. The rcm command produces a series of graphs for visualization along the way. We demonstrate the use of the rcm command by revisiting classic examples of political and economic integration between Hong Kong and mainland China (Hsiao, Ching, and Wan 2012) and German reunification (Abadie, Diamond, and Hainmueller, 2015, American Journal of Political Science 59: 495–510).
Title: "rcm: A command for the regression control method" (Stata Journal)
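As a rough illustration of the counterfactual construction described in the rcm abstract, the sketch below regresses a treated unit's pretreatment outcomes on a control unit's outcomes and projects the fitted line into the posttreatment periods. It is not the rcm command or its syntax. The data are made up, the names `fit_line`, `control`, `treated`, and `t0` are hypothetical, and a single control unit is used only for brevity; a real application would draw on many control units together with one of the model-selection methods the abstract lists.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical panel: one control unit and one treated unit over six periods.
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
treated = [3.0, 5.0, 7.0, 9.0, 13.0, 15.0]
t0 = 4  # number of pretreatment periods (assumed known)

# Fit on pretreatment periods only, then predict the untreated path afterward.
a, b = fit_line(control[:t0], treated[:t0])
counterfactual = [a + b * x for x in control[t0:]]

# Estimated treatment effect = observed outcome minus predicted counterfactual.
effects = [obs - cf for obs, cf in zip(treated[t0:], counterfactual)]

print(counterfactual)  # [11.0, 13.0]
print(effects)         # [2.0, 2.0]
```

The placebo tests mentioned in the abstract could be approximated in the same spirit by repeating the fit with a control unit treated as if it were the treated one, or with an earlier fake value of `t0`.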
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221140932
N. Cox, S. Jenkins
Christopher F. (Kit) Baum was born in 1951 and grew up in Northern Michigan. He received degrees in economics from Kalamazoo College, Florida Atlantic University, and the University of Michigan in, respectively, 1972, 1973, and 1977. He joined the faculty at Boston College in 1977 and has been based there ever since, now as Professor of Economics and courtesy Professor of Social Work. He has chaired the Economics Department since 2018. Kit’s research ranges widely, with interests most recently focused on social epidemiology and health policy. His other research fields include time-series econometrics, financial markets, and macroeconomic policy. In his rare spare time, Kit enjoys foreign travel and outdoor recreation in Northern Michigan and the Adirondacks.
Title: "The Stata Journal Editors’ Prize 2022: Christopher F. Baum" (Stata Journal)
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221141002
N. Akhtar-Danesh, Stephen C. Wingreen
In this article, we introduce qpair as a new command written in Stata for the analysis of paired Q-sorts in Q-methodology, which is used for studying subjective issues and is a combination of qualitative and quantitative techniques. The quantitative component of Q-methodology employs a by-person factor analysis technique. However, currently there is no systematic approach for analyzing paired Q-sorts or longitudinal data in Q-methodology. We introduce the only statistical command available for the analysis of paired Q-sorts. The qpair command employs the factor extraction and factor rotation techniques in Stata. The command is illustrated using a dataset representing perceptions of 50 information technology professionals on person–organization fit regarding their training and development priorities.
Title: "qpair: A command for analyzing paired Q-sorts in Q-methodology" (Stata Journal)
Pub Date: 2022-12-01; Epub Date: 2023-01-05; DOI: 10.1177/1536867x221140953
John A Gallis, Xueqi Wang, Paul J Rathouz, John S Preisser, Fan Li, Elizabeth L Turner
Stepped wedge cluster randomized trials (SW-CRTs) are increasingly being used to evaluate interventions in medical, public health, educational, and social science contexts. Because of the longitudinal and crossover nature of an SW-CRT, complex analysis techniques are often needed, which makes appropriately powering SW-CRTs challenging. In this paper, we introduce a newly developed SW-CRT power calculator, embedded within the power command in Stata. The power calculator assumes a marginal model (i.e., generalized estimating equations [GEE]) for the primary analysis of SW-CRTs, for which other currently available SW-CRT power calculators may not be suitable. The program accommodates complete cross-sectional and closed-cohort designs and includes multilevel correlation structures appropriate for such designs. We discuss the methods and formulae underlying our SW-CRT calculator and provide illustrative examples of the use of power swgee. We provide suggestions about the choice of parameters in power swgee and conclude by discussing areas of future research that may improve the program.
Title: "power swgee: GEE-based power calculations in stepped wedge cluster randomized trials" (Stata Journal). Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10035664/pdf/
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221141022
Max D. Weinreb, J. Trinitapoli
In this article, we introduce the printcase command, which outputs data from a specific observation into an easy-to-read Microsoft Word or PDF document. printcase allows analysts to focus on a single observation within a dataset and view that observation in its entirety. The output displays fields in table format, with all variables identified by their corresponding labels and all responses identified by their corresponding value labels. We explain how printcase works, give examples of circumstances under which this type of table-based quasiquestionnaire would be useful, and provide code for printing single observations.
Title: "printcase: A command for visualizing single observations" (Stata Journal)
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221141012
Xiangmei Ma, Y. Cheung
We describe five asymptotically unbiased estimators of intervention effects on event rates in nonmatched and matched-pair cluster randomized trials, and we present a bias-corrected version of the estimators for use when the number of clusters is small. The estimators are the ratio of mean counts (r1), ratio of mean cluster-level event rates (r2), ratio of event rates (r3), double ratio of counts (r4), and double ratio of event rates (r5). r1, r2, and r3 estimate the total effect, which comprises the direct and indirect effects; r4 and r5 estimate the direct effect. We describe a new command, crtrest, that provides these ratio estimators and their standard errors in nonmatched and matched-pair cluster randomized trials.
Title: "crtrest: A command for ratio estimators of intervention effects on event rates in cluster randomized trials" (Stata Journal)
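To make the first three ratio estimators in the crtrest abstract more concrete, the sketch below computes them for a nonmatched design from per-cluster event counts and person-time. The definitions are assumptions read off the estimator names, not taken from the article: r1 as the ratio of mean cluster counts, r2 as the ratio of mean cluster-level rates, and r3 as the ratio of pooled rates (total events over total person-time). The function name `ratio_estimators` and the data are hypothetical, and no bias correction or standard errors are attempted; r4 and r5 are omitted because the abstract does not define them.

```python
def mean(values):
    return sum(values) / len(values)


def arm_summaries(clusters):
    """clusters: list of (events, person_time) tuples for one trial arm."""
    counts = [events for events, _ in clusters]
    times = [time for _, time in clusters]
    mean_count = mean(counts)                        # basis for r1
    mean_rate = mean([e / t for e, t in clusters])   # basis for r2
    pooled_rate = sum(counts) / sum(times)           # basis for r3
    return mean_count, mean_rate, pooled_rate


def ratio_estimators(intervention, control):
    """Return (r1, r2, r3) as intervention-to-control ratios (assumed definitions)."""
    top = arm_summaries(intervention)
    bottom = arm_summaries(control)
    return tuple(t / b for t, b in zip(top, bottom))


# Made-up data: two clusters per arm, each given as (events, person-time).
intervention = [(10, 100.0), (20, 400.0)]
control = [(30, 100.0), (30, 300.0)]

r1, r2, r3 = ratio_estimators(intervention, control)
print(round(r1, 3), round(r2, 3), round(r3, 3))  # 0.5 0.375 0.4
```

The three values differ here because the clusters have unequal person-time, which is one reason a trial analyst might want all of them reported side by side.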
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221141058
N. Cox
Two common problems with graph axis labels are to decide in advance on some “nice” numbers to use on one or both axes and to show particular labels on some transformed scale. In this column, I discuss the nicelabels and mylabels commands, which address these problems. The first command is new to Stata, and the second is a revision of a previously published command. I also survey the myticks command for tick placement. In all commands, the main output is a local macro in the calling program’s space, in the interest of promoting automation in do-files and programs.
Title: "Speaking Stata: Automating axis labels: Nice numbers and transformed scales" (Stata Journal)
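The two labeling problems named in the column can be sketched outside Stata as follows. This is a generic "nice numbers" heuristic (step sizes of 1, 2, or 5 times a power of 10), not the algorithm used by nicelabels, and the helper names `nice_step`, `nice_labels`, and `transformed_labels` are hypothetical. Each result is returned as a single space-separated string, loosely mirroring the abstract's point that the commands leave their output in a local macro for later use.

```python
import math


def nice_step(span, target=5):
    """Pick a step of 1, 2, 5, or 10 times a power of 10, aiming at about `target` intervals."""
    raw = span / target
    power = 10 ** math.floor(math.log10(raw))
    for multiple in (1, 2, 5, 10):
        if raw / power <= multiple:
            return multiple * power
    return 10 * power  # guard against floating-point edge cases


def nice_labels(lo, hi, target=5):
    """Space-separated 'nice' labels covering the interval [lo, hi] (requires hi > lo)."""
    step = nice_step(hi - lo, target)
    start = math.floor(lo / step) * step
    stop = math.ceil(hi / step) * step
    count = round((stop - start) / step)
    labels = [round(start + i * step, 10) for i in range(count + 1)]
    return " ".join(str(v) for v in labels)


def transformed_labels(values, transform):
    """Pair each position on the transformed scale with a label on the original scale."""
    return " ".join(f'{transform(v):g} "{v:g}"' for v in values)


print(nice_labels(3, 97))                           # 0 20 40 60 80 100
print(transformed_labels([1, 10, 100], math.log10))  # 0 "1" 1 "10" 2 "100"
```

The second string follows the position-then-quoted-text pattern that Stata axis label options accept, so a variable plotted on a log10 scale could carry labels in the original units.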
Pub Date: 2022-12-01; DOI: 10.1177/1536867X221141068
N. Cox
Searching for particular text within strings is a common data management problem. One frequent context is whenever various possible answers to a question are bundled together in values of a string variable. Suppose people are asked which sports they enjoy or something more interesting, like which statistical software they use routinely. To keep the matter simple, we will first imagine just lists of one or more numbers that are concise codes for distinct answers, say, "42" for "cricket" or "1" for "Stata". Nonnumeric codes will also be considered in due course. For more on handling such questions, sometimes called multiple response, see Cox and Kohler (2003) or Jann (2005).
Title: "Stata tip 148: Searching for words within strings" (Stata Journal)
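A small sketch may help show why searching for whole words matters with bundled numeric codes of the kind the tip describes. It is written in Python rather than Stata and does not reproduce the tip's own code; the function name `has_code` and the sample answers are hypothetical, and codes are assumed to be separated by spaces. A plain substring search for "1" also matches "10" and "21", whereas splitting the string into words and testing membership does not.

```python
def has_code(answers: str, code: str) -> bool:
    """True if `code` appears as a whole space-separated word in `answers`."""
    return code in answers.split()


# Hypothetical multiple-response answers, one string per respondent.
responses = ["1 42", "10 11", "42", "21 1 3"]

# Naive substring search: "1" is found inside "10", "11", and "21" as well.
print(["1" in r for r in responses])           # [True, True, False, True]

# Word-level search: only respondents who actually listed code 1.
print([has_code(r, "1") for r in responses])   # [True, False, False, True]
```

The same word-level test works unchanged for nonnumeric codes, for example `has_code("stata r python", "r")`.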
Pub Date: 2022-09-01; DOI: 10.1177/1536867X221124466
Sven-Kristjan Bormann
In this article, I introduce new commands to calculate second-generation p-values (SGPVs) for common estimation commands in Stata. The sgpv command and its companions allow the easy calculation of SGPVs and their associated diagnostics, as well as the plotting of SGPVs against the standard p-values.
Title: "A Stata implementation of second-generation p-values" (Stata Journal)
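For readers unfamiliar with the quantity the sgpv command reports, the sketch below computes a second-generation p-value from an interval estimate and an interval null hypothesis, using the usual overlap definition: the fraction of the interval estimate that lies inside the null interval, multiplied by a correction factor max(|I| / (2|H0|), 1) for very wide estimates. This is a Python illustration under that assumed definition, not the command's implementation; the function name `sgpv` and the example intervals are hypothetical.

```python
def sgpv(est_lo, est_hi, null_lo, null_hi):
    """Second-generation p-value for interval estimate [est_lo, est_hi]
    against interval null hypothesis [null_lo, null_hi]."""
    est_len = est_hi - est_lo
    null_len = null_hi - null_lo
    if est_len <= 0 or null_len <= 0:
        raise ValueError("both intervals must have positive length")
    # Length of the intersection of the two intervals (0 if they are disjoint).
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))
    # Correction that caps the value at 1/2 when the estimate is much wider than the null.
    correction = max(est_len / (2 * null_len), 1.0)
    return overlap / est_len * correction


null = (-1.0, 1.0)  # hypothetical interval null of practically negligible effects

print(sgpv(2.0, 4.0, *null))     # 0.0: estimate lies entirely outside the null interval
print(sgpv(-0.5, 0.5, *null))    # 1.0: estimate lies entirely inside the null interval
print(sgpv(0.0, 2.0, *null))     # 0.5: half of the estimate overlaps the null interval
```

A value of 0 indicates that the data support only effects outside the null interval, a value of 1 indicates support only for null effects, and values in between indicate inconclusive data.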