Requiem for odds ratios
Edward C. Norton PhD, Bryan E. Dowd PhD, Melissa M. Garrido PhD, Matthew L. Maciejewski PhD
Health Services Research, published 2024-06-01. DOI: 10.1111/1475-6773.14337 (https://onlinelibrary.wiley.com/doi/10.1111/1475-6773.14337)
Abstract
Health Services Research encourages authors to report marginal effects instead of odds ratios for logistic regression with a binary outcome. Specifically, in the instructions for authors, Manuscript Formatting and Submission Requirements, section 2.4.2.2 Structured abstract and keywords, it reads “Reporting of odds ratios is discouraged (marginal effects preferred) except in case-control studies” (see the HSR website https://www.hsr.org/authors/manuscript-formatting-submission-requirements).
We applaud this decision. We also encourage other journals to make the same decision. It is time to end the reporting of odds ratios in the scientific literature for most research studies, except for case–control studies with matched samples.
HSR's decision is due to increasing recognition not only that odds ratios are confusing to non-researchers [1, 2] but also that researchers themselves often misinterpret them [3, 4]. Odds ratios are also of limited utility in meta-analyses. Marginal effects, which represent the difference in the probability of a binary outcome between comparison groups, are more straightforward to interpret and compare. Below, we illustrate the difficulties in interpreting odds ratios, outline the conditions that must be met for odds ratios to be compared directly, and explain how marginal effects overcome these difficulties.
Consider a hypothetical prospective cohort study of whether a new hospital-based discharge program affects the 30-day readmission rate, a binary outcome, observed for each patient who is discharged alive. The program's goal is to help eligible patients avoid unnecessary readmissions, and patients are randomized into participating in the program or not. Suppose that a carefully designed study estimates the logistic regression coefficient (the log odds ratio) on the discharge program to be −0.2, indicating that readmission rates are lower for patients who participate in the discharge program than for patients who do not. When writing about the results, the researcher must decide how to report the magnitude of the change and has several options.
One option is to report the odds ratio, which in this case is 0.82 = exp(−0.2), and then compare it with other published odds ratios in the literature. However, this estimated odds ratio of 0.82 depends on an unobservable scaling factor that makes its interpretation conditional on the data and on the model specification [3, 5]. As odds ratios are scaled by different unobservable factors and are conditional on different model specifications, the estimated odds ratio cannot be compared with any other odds ratio [6, 7]. Even within a single study, odds ratios based on models including different sets of covariates cannot be compared. It would be more accurate to report that, “The estimated odds ratio is 0.82, conditional on the covariates included in the regression, but a different odds ratio would be found if the model included a different set of explanatory variables.” Because every estimated odds ratio embeds an unobserved scaling factor, odds ratios are not generalizable.
Odds ratios from different covariate specifications within the same study or between different studies can almost never be compared directly. The explanation for this requires an understanding of how logistic regression differs from linear regression [3]. In least squares regression, adding covariates that predict the outcome but are independent of other covariates (and are therefore not mediators or confounders) does not change either the estimated parameters or the marginal effects. Adding more independent covariates to a linear regression just reduces the amount of unexplained variation, which reduces the error variance (σ²) and results in smaller standard errors for each parameter or marginal effect because of improved precision. For example, in a perfectly executed randomized controlled trial (RCT), the assignment to treatment is independent of all covariates, and the covariates are balanced in the treatment and comparison groups. In a perfectly executed RCT, the estimated treatment effect from a least squares regression should be the same whether covariates are included or not. The only difference between the estimates with and without covariate adjustment lies in the standard errors. Including covariates corrects for any imbalance in the covariates resulting from sampling variation. Adding covariates thus improves statistical significance while leaving the expected value of the estimated treatment effects unchanged.
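The linear-regression behavior described above can be checked with a small simulation. The sketch below is only illustrative: the sample size, the seed, and the true coefficients (intercept 1, treatment effect 2, covariate effect 3) are hypothetical values chosen for the example, and the hand-rolled `ols` and `invert` helpers stand in for whatever regression software a researcher would actually use.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # estimated error variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se


def simulate_trial(n=20000, seed=0):
    """Randomized treatment t, independent covariate x, continuous outcome y (all hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = 1 if rng.random() < 0.5 else 0       # randomized assignment
        x = rng.gauss(0.0, 1.0)                  # covariate independent of t
        y = 1.0 + 2.0 * t + 3.0 * x + rng.gauss(0.0, 1.0)
        rows.append((t, x, y))
    return rows


rows = simulate_trial()
outcome = [y for _, _, y in rows]
unadjusted = ols([[1.0, t] for t, _, _ in rows], outcome)
adjusted = ols([[1.0, t, x] for t, x, _ in rows], outcome)

print("treatment effect, no covariate:   %.3f (SE %.3f)" % (unadjusted[0][1], unadjusted[1][1]))
print("treatment effect, with covariate: %.3f (SE %.3f)" % (adjusted[0][1], adjusted[1][1]))
```

Under these assumptions both estimates should land near the true effect of 2, while the standard error shrinks noticeably once the covariate is included.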
This result does not carry over to logistic regression (or to probit regression). In contrast to linear regression applied to the RCT, adding covariates will change the estimated coefficients in a logistic regression of a binary outcome from the same RCT, even when those added covariates are not confounders [3-7]. Therefore, the estimated odds ratios also change, unlike in linear regression, where the estimated coefficients do not. The odds ratios change because the estimated coefficients in a logistic regression are scaled by an arbitrary factor equal to the square root of the variance of the unexplained part of the binary outcome, σ. That is, logistic regressions estimate β/σ, not β (for the full mathematical derivation, see Norton and Dowd [3]). More problematic still, σ is unknown to the researcher.
Because the estimated coefficients in a logistic regression are scaled by an arbitrary factor σ, the odds ratios are also scaled by an arbitrary factor (odds ratio = exp(β/σ)). Ideally, this arbitrary scaling factor σ would be invariant to changes in covariate specification, but it is not. In fact, it changes when more explanatory variables are added to the logistic regression model, because the added variables explain more of the total variation and thereby reduce the unexplained variance and σ. Therefore, adding more independent explanatory variables to the model will increase the magnitude of the estimated coefficient on the variable of interest (e.g., treatment) and move its odds ratio further from 1, because β is divided by a smaller scaling factor (σ). This does not occur when the strength of association is represented by relative risks or absolute risks.
In the same perfectly executed RCT, adding covariates to a logistic regression on a binary outcome would change the magnitude of the estimated treatment effect (log odds ratio, β/σ) and the corresponding odds ratio (exp(β/σ)). As a result, the interpretation of the odds ratio depends on the covariates included in the model. A comparison with ORs from prior literature is not meaningful if either the covariate specification or the sample differs, because the unknown σ is different for each study.
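A minimal numerical sketch of this dependence on the covariate set follows. It assumes a hypothetical population in which the outcome follows a logistic model with a treatment coefficient of −0.2, an intercept of −1, and one binary covariate (prevalence 0.5, independent of treatment) whose coefficient `gamma` we vary. A logistic regression that includes the covariate recovers exp(−0.2); a regression with treatment alone is saturated, so in large samples its odds ratio equals the one computed from the arm-level readmission probabilities. All names and numbers here are assumptions made for illustration.

```python
import math


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


def marginal_odds_ratio(beta, gamma, intercept=-1.0):
    """Treatment odds ratio when the binary covariate x is left out of the model.

    x has P(x = 1) = 0.5 and is independent of treatment, so each arm's
    outcome probability is the simple average over x = 0 and x = 1.
    """
    def arm_probability(t):
        return 0.5 * (logistic(intercept + beta * t)
                      + logistic(intercept + beta * t + gamma))
    return odds(arm_probability(1)) / odds(arm_probability(0))


beta = -0.2
print("odds ratio with covariate in the model: %.3f" % math.exp(beta))
for gamma in (0.0, 1.0, 2.0, 3.0):
    print("gamma = %.1f -> odds ratio without covariate: %.3f"
          % (gamma, marginal_odds_ratio(beta, gamma)))
```

With `gamma = 0` the two odds ratios agree at about 0.82. With `gamma = 3` the treatment-only odds ratio is about 0.89, so adding the covariate back moves the reported odds ratio from roughly 0.89 to 0.82 even though the covariate is not a confounder.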
In the readmission example above, a clearer option would be to report marginal effects in terms of a percentage point change in the probability of readmission, along with the base readmission rate for context [8].
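To see why the base rate matters, the following sketch converts the coefficient of −0.2 into a percentage point change at several baseline readmission rates. The baseline rates (5%, 20%, 50%) are hypothetical and chosen only to show the spread; the helper names are likewise illustrative.

```python
import math


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def treated_probability(baseline, log_odds_ratio):
    """Readmission probability under treatment, given the untreated baseline rate."""
    return logistic(logit(baseline) + log_odds_ratio)


coef = -0.2
odds_ratio = math.exp(coef)
print("odds ratio = %.2f" % odds_ratio)

for baseline in (0.05, 0.20, 0.50):  # hypothetical base readmission rates
    treated = treated_probability(baseline, coef)
    change_pp = 100.0 * (treated - baseline)
    print("baseline %4.1f%% -> treated %4.1f%% (%+.2f percentage points)"
          % (100.0 * baseline, 100.0 * treated, change_pp))
```

The same odds ratio of 0.82 corresponds to reductions of roughly 0.9, 3.0, and 5.0 percentage points at these three baselines, which is why the percentage point change and the base rate are more informative when reported together.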
In health services research, the most common way of reporting marginal effects is through average marginal effects: the average of the marginal effects computed for each person. These are interpreted as the mean percentage point difference (not the percent difference) in outcome probabilities that accompanies a change in the treatment variable's value. For binary treatments, an alternative is to present the predicted probabilities of the outcome when the treatment variable equals 0 and 1.
Marginal effects are much less sensitive to the unknown scaling factor and exhibit little change when independent covariates are added to the logistic regression model. When averaged, many of these small changes cancel out [3]. The magnitude of average marginal effects can be compared across different studies, whereas the magnitude of odds ratios cannot. For this reason, marginal effects are the preferable quantity to report from logistic regressions in both RCTs and nonrandomized studies.
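The stability of the average marginal effect can be illustrated with an idealized calculation. The sketch assumes a hypothetical population of 1,000 people, half with a binary covariate x = 1, perfectly balanced across arms, and a true logistic model with intercept −1, treatment coefficient −0.2, and covariate coefficient 3. It compares the average of the person-level effects from the model that includes x with the risk difference implied by a treatment-only model (which is saturated, so its fitted probabilities equal the arm-level outcome rates). These are large-sample quantities under stated assumptions, not estimates from data.

```python
import math

INTERCEPT, BETA, GAMMA = -1.0, -0.2, 3.0   # hypothetical true coefficients
covariate_values = [0, 1] * 500            # hypothetical people, half with x = 1


def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted(t, x):
    """Outcome probability for a person with treatment t and covariate x."""
    return logistic(INTERCEPT + BETA * t + GAMMA * x)


def ame_full_model(xs):
    """Average over people of each person's own treated-minus-untreated probability."""
    effects = [predicted(1, x) - predicted(0, x) for x in xs]
    return sum(effects) / len(effects)


def ame_treatment_only_model(xs):
    """Difference in arm-level outcome rates, as a treatment-only model would report."""
    rate_treated = sum(predicted(1, x) for x in xs) / len(xs)
    rate_untreated = sum(predicted(0, x) for x in xs) / len(xs)
    return rate_treated - rate_untreated


full = ame_full_model(covariate_values)
reduced = ame_treatment_only_model(covariate_values)
print("AME with covariate:    %+.2f percentage points" % (100.0 * full))
print("AME without covariate: %+.2f percentage points" % (100.0 * reduced))
```

In this idealized setting the two numbers coincide at about −3.0 percentage points, because averaging person-level differences and differencing arm-level averages are the same operation. In finite samples with imperfect balance, small differences would be expected.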
Because odds ratios are not comparable across studies with different unknown scaling factors, they have limited utility in systematic reviews and meta-analyses. Marginal effects overcome these difficulties.
Similarly, marginal effects are preferable to odds ratios or coefficients when using logistic regression to generate predictive models that will be applied to other populations. The magnitude of the unknown scaling factor in odds ratios or log odds will differ across populations, limiting the generalizability of a predictive model to a population other than the one in which it was trained and tested.
The choice of how to report results from a logistic regression is important because logistic regression is one of the most common statistical tools in the health services research toolkit. It is also important that researchers—especially researchers who study public policies and quality of care—communicate their results and conclusions clearly to other researchers, policymakers, and the public. Therefore, HSR's stand on odds ratios will help improve interpretation and communication.
We commend Health Services Research for deciding to discourage the reporting of odds ratios in most studies. We agree wholeheartedly with this decision, which keeps Health Services Research at the forefront of best practices.
Dr. Maciejewski was also supported by a Research Career Scientist award from the Department of Veterans Affairs (RCS 10-391).
About the journal:
Health Services Research (HSR) is a peer-reviewed scholarly journal that provides researchers and public and private policymakers with the latest research findings, methods, and concepts related to the financing, organization, delivery, evaluation, and outcomes of health services. Rated as one of the top journals in the fields of health policy and services and health care administration, HSR publishes outstanding articles reporting the findings of original investigations that expand knowledge and understanding of the wide-ranging field of health care and that will help to improve the health of individuals and communities.