blandaltman: A command to create variants of Bland–Altman plots
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231196488
Mark D. Chatfield, Tim J. Cole, Henrica C. W. de Vet, Louise Marquart-Wilson, Daniel M. Farewell
Bland–Altman plots can be useful in paired data settings such as measurement-method comparison studies. A Bland–Altman plot has differences, percentage differences, or ratios on the y axis and a mean of the data pairs on the x axis, with 95% limits of agreement indicating the central 95% range of differences, percentage differences, or ratios. This range can vary with the mean. We introduce the community-contributed blandaltman command, which uniquely in Stata can 1) create Bland–Altman plots featuring ratios in addition to differences and percentage differences, 2) allow the limits of agreement for ratios and percentage differences to vary as a function of the mean, and 3) add confidence intervals, prediction intervals, and tolerance intervals to the plots.
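A minimal usage sketch follows. The paired measurement variables meas1 and meas2 are hypothetical, the installation source is assumed, and the options shown are inferred from the abstract rather than taken from the command's documented syntax:

    * Hedged sketch, not verified blandaltman syntax
    ssc install blandaltman              // installation source assumed
    blandaltman meas1 meas2              // default: differences vs. means with 95% limits of agreement
    blandaltman meas1 meas2, ratio       // assumed option: ratios on the y axis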
{"title":"blandaltman: A command to create variants of Bland–Altman plots","authors":"Mark D. Chatfield, Tim J. Cole, Henrica C. W. de Vet, Louise Marquart-Wilson, Daniel M. Farewell","doi":"10.1177/1536867x231196488","DOIUrl":"https://doi.org/10.1177/1536867x231196488","url":null,"abstract":"Bland–Altman plots can be useful in paired data settings such as measurement-method comparison studies. A Bland–Altman plot has differences, percentage differences, or ratios on the y axis and a mean of the data pairs on the x axis, with 95% limits of agreement indicating the central 95% range of differences, percentage differences, or ratios. This range can vary with the mean. We introduce the community-contributed blandaltman command, which uniquely in Stata can 1) create Bland–Altman plots featuring ratios in addition to differences and percentage differences, 2) allow the limits of agreement for ratios and percentage differences to vary as a function of the mean, and 3) add confidence intervals, prediction intervals, and tolerance intervals to the plots.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135429099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
synth2: Synthetic control method with placebo tests, robustness test, and visualization
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231195278
Guanpeng Yan, Qiang Chen
The synthetic control method (Abadie and Gardeazabal, 2003, American Economic Review 93: 113–132, Abadie, Diamond, and Hainmueller, 2010, Journal of the American Statistical Association 105: 493–505) is a popular method for causal inference in panel data with one treated unit that often uses placebo tests for statistical inference. While the synthetic control method can be implemented by the excellent command synth (Abadie, Diamond, and Hainmueller, 2011, Statistical Software Components S457334, Department of Economics, Boston College), it is still inconvenient for users to conduct placebo tests. As a wrapper program for synth, our proposed synth2 command provides convenient utilities to automate both in-space and in-time placebo tests, as well as the leave-one-out robustness test. Moreover, synth2 produces a complete set of graphs to visualize covariate or unit weights, covariate balance, actual or predicted outcomes, treatment effects, placebo tests, ratio of posttreatment mean squared prediction error to pretreatment mean squared prediction error, pointwise p-values (two-sided, right-sided, and left-sided), and the leave-one-out robustness test. We illustrate the use of the synth2 command by revisiting the classic example of California’s tobacco control program (Abadie, Diamond, and Hainmueller 2010).
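The sketch below hints at how the California tobacco example might be run. It assumes the smoking dataset distributed with the synth package (unit 3 = California, treatment in 1989), abbreviates the covariate list, and treats the placebo and leave-one-out option spellings as assumptions rather than documented syntax:

    * Hedged sketch of the classic California example; option spellings assumed
    ssc install synth
    ssc install synth2
    sysuse smoking, clear                // dataset installed with synth (path assumption)
    tsset state year
    synth2 cigsale lnincome beer age15to24 retprice cigsale(1988) cigsale(1980) cigsale(1975), ///
        trunit(3) trperiod(1989) placebo(unit) loo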
{"title":"synth2: Synthetic control method with placebo tests, robustness test, and visualization","authors":"Guanpeng Yan, Qiang Chen","doi":"10.1177/1536867x231195278","DOIUrl":"https://doi.org/10.1177/1536867x231195278","url":null,"abstract":"The synthetic control method (Abadie and Gardeazabal, 2003, American Economic Review 93: 113–132, Abadie, Diamond, and Hainmueller, 2010, Journal of the American Statistical Association 105: 493–505) is a popular method for causal inference in panel data with one treated unit that often uses placebo tests for statistical inference. While the synthetic control method can be implemented by the excellent command synth (Abadie, Diamond, and Hainmueller, 2011, Statistical Software Components S457334, Department of Economics, Boston College), it is still inconvenient for users to conduct placebo tests. As a wrapper program for synth, our proposed synth2 command provides convenient utilities to automate both in-space and in-time placebo tests, as well as the leave-one-out robustness test. Moreover, synth2 produces a complete set of graphs to visualize covariate or unit weights, covariate balance, actual or predicted outcomes, treatment effects, placebo tests, ratio of posttreatment mean squared prediction error to pretreatment mean squared prediction error, pointwise p-values (two-sided, right-sided, and left-sided), and the leave-one-out robustness test. We illustrate the use of the synth2 command by revisiting the classic example of California’s tobacco control program (Abadie, Diamond, and Hainmueller 2010).","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
winratiotest: A command for implementing the win ratio and stratified win ratio in Stata
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231196480
John Gregson, João Pedro Ferreira, Tim Collier
The win ratio is a statistical method most commonly used for analyzing composite outcomes in clinical trials. Composite outcomes comprise two or more distinct “component” events (for example, myocardial infarction or death) and are typically analyzed using time-to-first-event methods, ignoring the relative importance of the component events. When using the win ratio, component events are instead placed into a hierarchy from most to least important; more important components can then be prioritized over less important outcomes (for example, death can be prioritized over myocardial infarction). Furthermore, the win ratio enables outcomes of different types (for example, time-to-event, continuous, binary, ordinal, and repeat events) to be combined. We present winratiotest, a command to implement the win-ratio approach for hierarchical outcomes in a flexible and user-friendly way.
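For intuition, the win ratio for a single outcome is simply the number of treated-versus-control pairwise "wins" divided by the number of "losses", with ties dropped. The sketch below computes that quantity by brute force for one continuous outcome where higher is better; it is illustrative only, does not reproduce winratiotest's syntax, hierarchy handling, stratification, or inference, and the variables y and treat are hypothetical:

    * Brute-force unstratified win ratio for one continuous outcome (higher = better).
    * Purely illustrative; y and treat are hypothetical variables.
    preserve
    tempfile controls
    keep if treat == 0
    keep y
    rename y y_control
    save `controls'
    restore, preserve                    // back to the full data, still preserved
    keep if treat == 1
    keep y
    rename y y_treated
    cross using `controls'               // every treated-control pair
    count if y_treated > y_control
    local wins = r(N)
    count if y_treated < y_control
    local losses = r(N)
    display "Win ratio = " %6.3f `wins'/`losses'
    restore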
{"title":"winratiotest: A command for implementing the win ratio and stratified win ratio in Stata","authors":"John Gregson, João Pedro Ferreira, Tim Collier","doi":"10.1177/1536867x231196480","DOIUrl":"https://doi.org/10.1177/1536867x231196480","url":null,"abstract":"The win ratio is a statistical method most commonly used for analyzing composite outcomes in clinical trials. Composite outcomes comprise two or more distinct “component” events (for example, myocardial infarction or death) and are typically analyzed using time-to-first-event methods ignoring the relative importance of the component events. When using the win ratio, component events are instead placed into a hierarchy from most to least important; more important components can then be prioritized over less important outcomes (for example, death can be prioritized over myocardial infarction). Furthermore, the win ratio enables outcomes of different types (for example, time-to-event, continuous, binary, ordinal, and repeat events) to be combined. We present winratiotest, a command to implement the win-ratio approach for hierarchical outcomes in a flexible and user-friendly way.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135429097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cluster randomized controlled trial analysis at the cluster level: The clan command
Pub Date: 2023-09-01. Epub Date: 2023-09-22. DOI: 10.1177/1536867X231196294. Stata Journal 23(3): 754–773
Jennifer A Thompson, Baptiste Leurent, Stephen Nash, Lawrence H Moulton, Richard J Hayes
In this article, we introduce a new command, clan, that conducts a cluster-level analysis of cluster randomized trials. The command simplifies adjusting for individual- and cluster-level covariates and can also account for a stratified design. It can be used to analyze a continuous, binary, or rate outcome.
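For orientation, the sketch below does by hand the generic two-stage cluster-level analysis that clan automates and extends (here with no covariate adjustment or stratification). It is not the clan command itself, and y, arm, and clusterid are hypothetical variables:

    * Generic cluster-level analysis by hand: collapse to one summary per cluster,
    * then compare arms across the cluster summaries.
    preserve
    collapse (mean) y (first) arm, by(clusterid)
    ttest y, by(arm) unequal             // unadjusted comparison of cluster means
    restore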
{"title":"Cluster randomized controlled trial analysis at the cluster level: The clan command.","authors":"Jennifer A Thompson, Baptiste Leurent, Stephen Nash, Lawrence H Moulton, Richard J Hayes","doi":"10.1177/1536867X231196294","DOIUrl":"10.1177/1536867X231196294","url":null,"abstract":"<p><p>In this article, we introduce a new command, clan, that conducts a cluster-level analysis of cluster randomized trials. The command simplifies adjusting for individual- and cluster-level covariates and can also account for a stratified design. It can be used to analyze a continuous, binary, or rate outcome.</p>","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"23 3","pages":"754-773"},"PeriodicalIF":4.8,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7615216/pdf/EMS189331.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41240748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
mpitb: A toolbox for multidimensional poverty indices
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231195286
Nicolai Suppa
In this article, I present mpitb, a toolbox for multidimensional poverty indices (MPIs). The package mpitb comprises several subcommands to facilitate specification, estimation, and analyses of MPIs and supports the popular Alkire–Foster framework for multidimensional poverty measurement. mpitb offers several benefits to researchers, analysts, and practitioners working on MPIs, including substantial time savings (for example, because of lower data management and programming requirements) while allowing for a more comprehensive analysis. Aside from various convenience functions, mpitb also provides low-level tools for advanced users and programmers.
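The specification/estimation split described above suggests a workflow along the following lines. The subcommand names (set, est) and every option shown are assumptions based on that description, not verified mpitb syntax, and the deprivation indicators d_* are hypothetical:

    * Hedged sketch of a specify-then-estimate workflow; all names are assumptions
    mpitb set, name(mympi) ///
        d1(d_nutrition d_childmortality, name(health)) ///
        d2(d_schooling d_attendance, name(education)) ///
        d3(d_water d_electricity d_housing, name(living))
    mpitb est, name(mympi) klist(33) weights(equal) svy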
{"title":"mpitb: A toolbox for multidimensional poverty indices","authors":"Nicolai Suppa","doi":"10.1177/1536867x231195286","DOIUrl":"https://doi.org/10.1177/1536867x231195286","url":null,"abstract":"In this article, I present mpitb, a toolbox for multidimensional poverty indices (MPIs). The package mpitb comprises several subcommands to facilitate specification, estimation, and analyses of MPIs and supports the popular Alkire– Foster framework to multidimensional poverty measurement. mpitb offers several benefits to researchers, analysts, and practitioners working on MPIs, including substantial time savings (for example, because of lower data management and programming requirements) while allowing for a more comprehensive analysis at the same time. Aside from various convenience functions, mpitb also provides low-level tools for advanced users and programmers.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating text regressions using txtreg_train
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231196349
Carlo Schwarz
In this article, I introduce new commands to estimate text regressions for continuous, binary, and categorical variables based on text strings. The command txtreg_train automatically handles text cleaning, tokenization, model training, and cross-validation for lasso, ridge, elastic-net, and regularized logistic regressions. The txtreg_predict command obtains the predictions from the trained text regression model. Furthermore, the txtreg_analyze command facilitates the analysis of the coefficients of the text regression model. Together, these commands provide a convenient toolbox for researchers to train text regressions. They also allow sharing of pretrained text regression models with other researchers.
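A sketch of the train/predict/analyze sequence named above follows. Only the three command names come from the abstract; the argument order, the model() option, and the variables rating and review_text are assumptions, not documented syntax:

    * Hedged sketch; only the command names come from the abstract
    txtreg_train rating review_text, model(lasso)   // assumed: outcome, then text variable
    txtreg_predict rating_hat                       // assumed: new variable holding predictions
    txtreg_analyze                                  // assumed: inspect the fitted coefficients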
{"title":"Estimating text regressions using txtreg_train","authors":"Carlo Schwarz","doi":"10.1177/1536867x231196349","DOIUrl":"https://doi.org/10.1177/1536867x231196349","url":null,"abstract":"In this article, I introduce new commands to estimate text regressions for continuous, binary, and categorical variables based on text strings. The command txtreg_train automatically handles text cleaning, tokenization, model training, and cross-validation for lasso, ridge, elastic-net, and regularized logistic regressions. The txtreg_predict command obtains the predictions from the trained text regression model. Furthermore, the txtreg_analyze command facilitates the analysis of the coefficients of the text regression model. Together, these commands provide a convenient toolbox for researchers to train text regressions. They also allow sharing of pretrained text regression models with other researchers.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"368 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ebct: Using entropy balancing for continuous treatments to estimate dose–response functions and their derivatives
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231196291
Stefan Tübbicke
Interest in evaluating dose–response functions of continuous treatments has been increasing recently. To facilitate the estimation of causal effects in this setting, I introduce the ebct command for the estimation of dose–response functions and their derivatives using entropy balancing for continuous treatments. First, balancing weights are estimated by numerically solving a globally convex optimization problem. These weights eradicate Pearson correlations between covariates and the treatment variable. Because simple uncorrelatedness may be insufficient to yield consistent estimates in the next step, higher moments of the treatment variable can be rendered uncorrelated with covariates. Second, the weights are used in local linear kernel regressions to estimate the dose–response function or its derivative. To perform statistical inference, I use a bootstrap procedure. The command also provides the option of producing publication-quality graphs for the estimated relationships.
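A hypothetical call, included only to indicate the kind of invocation involved; the argument order (outcome, continuous treatment, covariates), the installation source, and the variable names are assumptions about ebct rather than its documented syntax:

    * Hedged sketch; argument order and variable names are assumptions
    ssc install ebct                                        // installation source assumed
    ebct earnings traininghours age education experience    // dose-response of earnings in training hours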
{"title":"ebct: Using entropy balancing for continuous treatments to estimate dose–response functions and their derivatives","authors":"Stefan Tübbicke","doi":"10.1177/1536867x231196291","DOIUrl":"https://doi.org/10.1177/1536867x231196291","url":null,"abstract":"Interest in evaluating dose–response functions of continuous treatments has been increasing recently. To facilitate the estimation of causal effects in this setting, I introduce the ebct command for the estimation of dose–response functions and their derivatives using entropy balancing for continuous treatments. First, balancing weights are estimated by numerically solving a globally convex optimization problem. These weights eradicate Pearson correlations between covariates and the treatment variable. Because simple uncorrelatedness may be insufficient to yield consistent estimates in the next step, higher moments of the treatment variable can be rendered uncorrelated with covariates. Second, the weights are used in local linear kernel regressions to estimate the dose–response function or its derivative. To perform statistical inference, I use a bootstrap procedure. The command also provides the option of producing publication-quality graphs for the estimated relationships.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robit regression in Stata
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231195288
Roger B. Newson, Milena Falcaro
Logistic and probit models are the most popular regression models for binary outcomes. A simple robust alternative is the robit model, which replaces the underlying normal distribution in the probit model with a Student’s t distribution. The heavier tails of the t distribution (compared with the normal distribution) mean that model outliers are less influential. Robit regression models can be fit as generalized linear models with the link function defined as the inverse cumulative t distribution function with a specified number of degrees of freedom; they have been advocated as being particularly suitable for estimating inverse-probability weights and propensity scoring more generally. Here we describe a new command, robit, that implements robit regression in Stata.
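A hedged sketch using a standard example dataset follows. The df() option name for the t degrees of freedom is an assumption about robit's syntax; the probit line is ordinary Stata, shown for comparison:

    * Hedged sketch; df() is an assumed option name for the t degrees of freedom
    webuse lbw, clear                  // Hosmer-Lemeshow low-birthweight data
    robit low age lwt smoke, df(7)     // robit link with 7 df: heavier tails, outliers less influential
    probit low age lwt smoke           // probit for comparison; robit approaches probit as df grows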
{"title":"Robit regression in Stata","authors":"Roger B. Newson, Milena Falcaro","doi":"10.1177/1536867x231195288","DOIUrl":"https://doi.org/10.1177/1536867x231195288","url":null,"abstract":"Logistic and probit models are the most popular regression models for binary outcomes. A simple robust alternative is the robit model, which replaces the underlying normal distribution in the probit model with a Student’s t distribution. The heavier tails of the t distribution (compared with the normal distribution) mean that model outliers are less influential. Robit regression models can be fit as generalized linear models with the link function defined as the inverse cumulative t distribution function with a specified number of degrees of freedom; they have been advocated as being particularly suitable for estimating inverse-probability weights and propensity scoring more generally. Here we describe a new command, robit, that implements robit regression in Stata.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Iterative intercensal single-decrement life tables using Stata
Pub Date: 2023-09-01. DOI: 10.1177/1536867x231196441
Jerônimo Oliveira Muniz
One way to estimate mortality in countries with incomplete data is to utilize intercensal methods, which do not require model life tables and provide accurate results even in the presence of age distortions and death underregistration. In this article, I revisit three of these techniques (census based, death distribution, and an iterative procedure) and introduce ilt, a command to calculate single-decrement life tables and the net flow of migrants by age. The required inputs are two age-specific population distributions and the average number of deaths between them. The empirical example draws on data from Vietnam, but the methods are extendable to any context and period.
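A hypothetical invocation sketch only: the inputs (age-specific population counts at two censuses plus average deaths by age between them) follow the abstract's description, but the variable names and argument order are assumptions, not verified ilt syntax:

    * Hedged sketch; inputs follow the abstract, names and order are assumptions
    ilt pop1989 pop1999 avgdeaths      // by-age populations at two censuses + average deaths between them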
{"title":"Iterative intercensal single-decrement life tables using Stata","authors":"Jerônimo Oliveira Muniz","doi":"10.1177/1536867x231196441","DOIUrl":"https://doi.org/10.1177/1536867x231196441","url":null,"abstract":"One way to estimate mortality in countries with incomplete data is to utilize intercensal methods, which do not require model life tables and provide accurate results even in the presence of age distortions and death underregistration. In this article, I revisit three of these techniques (census based, death distribution, and an iterative procedure) and introduce ilt, a command to calculate singledecrement life tables and the net flow of migrants by age. The required inputs are two age-specific population distributions and the average number of deaths between them. The empirical example draws on data from Vietnam, but the methods are extendable to any context and period.","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135428773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Multilevel and Longitudinal Modeling Using Stata, Fourth Edition, by Sophia Rabe-Hesketh and Anders Skrondal","authors":"Leonardo Grilli, Carla Rampichini","doi":"10.1177/1536867x231196518","DOIUrl":"https://doi.org/10.1177/1536867x231196518","url":null,"abstract":"This article reviews Multilevel and Longitudinal Modeling Using Stata, Fourth Edition, by Rabe-Hesketh and Skrondal (2022, Stata Press).","PeriodicalId":51171,"journal":{"name":"Stata Journal","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135429100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}