Data and decision making – from odd to artificial
L. Marais
SA Orthopaedic Journal, 2022
DOI: 10.17159/2309-8309/2022/v21n2a0 (https://doi.org/10.17159/2309-8309/2022/v21n2a0)
Abstract
With my term as Editor-in-Chief of the SAOJ coming to an end soon, I cannot help but reflect on some of my past experiences in this role. Perhaps the most challenging (and satisfying) was the need to get to grips with some of the more intricate aspects of research methodology and statistics. At first glance, these concepts seem fairly straightforward, but they almost invariably become exceedingly complex the harder you look. The odds ratio (OR) is an excellent case in point.

There are a number of ways in which the association between an exposure and an outcome can be expressed. ORs are probably the most commonly used. The current emphasis on reporting 95% confidence intervals (CI), rather than only p-values, has resulted in us seeing and doing a lot more logistic regression. Along with the 95% CI, the statistical program also provides the OR, which is then reported in our results.

Now, ORs are tricky things. To justify this statement, I am going to have to go way back to the start, where all good research should start: with the definitions. A ratio is simply a number obtained by dividing one number by another, and there is not necessarily a relationship between the numerator and denominator. A proportion is a ratio that relates a part to a whole; thus there is a relationship between the numerator and denominator. A rate is a proportion where the denominator also takes into account another dimension, typically time. Defining probability (P) is a minefield, but for our purposes, we will limit it to the measure of the likelihood that an event will occur.

With the basics out of the way, let us delve a little deeper. Relative risk (RR), also known as the risk ratio, is a descriptive statistic commonly used in analytical studies. Risk can be defined as the probability of the outcome of interest occurring. RR is therefore essentially a ratio of proportions.
In statistical terms, RR is equal to the event rate in the exposed group divided by the event rate in the non-exposed (control) group (Figure 1). For example, imagine we are performing a study comparing the risk of developing infection following grade III open fractures when antibiotics are given within an hour of the injury (treatment group) or not (control group). If 5 out of 100 patients in the treatment group and 20 out of 100 patients in the control group get an infection, we have a relative risk of 0.25 (0.05/0.20). RR = 0.25 means exposed patients (i.e., those in the treatment group) are 0.25 times as likely to develop the outcome of interest. We could also state that patients receiving antibiotics within an hour were 75% (0.75 = 1 − 0.25) less likely to develop infection. As clinicians we generally prefer to think in terms of probabilities and relative risk.

The other commonly used descriptive statistic to report a measure of association is the odds ratio (OR). Odds can be defined as the relative probability of the outcome of interest occurring. So, what is this probability relative to? The probability of the outcome not occurring. In other words, odds represent the ratio of the probability of the event occurring to the probability of the event not occurring. Odds can mathematically be defined as P/(1 − P). The OR then is a ratio of ratios, equal to the odds of the outcome in the exposed group divided by the odds of the outcome in the non-exposed (control) group. An OR < 1 means reduced odds of the outcome of interest occurring, while an OR > 1 implies increased odds. Thus, in our open fracture example study, the odds of infection are 5/95 ≈ 0.053 in the treatment group and 20/80 = 0.25 in the control group, so the OR would be 0.21. This would mean that the odds (not risk) of infection are 79% lower in the group that received antibiotics. If an OR is 3.8, that would mean that the odds of the outcome of interest occurring were 3.8 times as high in the exposed group.
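As a minimal sketch, the arithmetic of the open fracture example can be reproduced in a few lines of Python. The function names (`risk`, `odds`, `relative_risk`, `odds_ratio`) are illustrative assumptions and do not come from any statistics package; the counts are those of the hypothetical study described above.

```python
def risk(events: int, total: int) -> float:
    """Risk: the proportion of patients in a group who develop the outcome."""
    return events / total


def odds(events: int, total: int) -> float:
    """Odds: P / (1 - P), which equals events / non-events."""
    p = risk(events, total)
    return p / (1 - p)


def relative_risk(exp_events: int, exp_total: int,
                  ctl_events: int, ctl_total: int) -> float:
    """RR: risk in the exposed group divided by risk in the control group."""
    return risk(exp_events, exp_total) / risk(ctl_events, ctl_total)


def odds_ratio(exp_events: int, exp_total: int,
               ctl_events: int, ctl_total: int) -> float:
    """OR: odds in the exposed group divided by odds in the control group."""
    return odds(exp_events, exp_total) / odds(ctl_events, ctl_total)


# Hypothetical study from the text: 5/100 infections with early
# antibiotics (exposed) versus 20/100 without (control).
rr = relative_risk(5, 100, 20, 100)
or_ = odds_ratio(5, 100, 20, 100)

print(f"RR = {rr:.2f}")   # risk is about a quarter of the control risk
print(f"OR = {or_:.2f}")  # odds are about a fifth of the control odds
```

Even in this small example the OR (about 0.21) sits further from 1 than the RR (0.25), which is why the two should not be read interchangeably.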
For the sake of completeness, I will also mention the number needed to treat (NNT), which is essentially the number of patients that need to receive the exposure to prevent one unwanted outcome. It is defined as the inverse of the absolute risk reduction (ARR). ARR is equal to the event rate in the control group (CER) minus the event rate in the exposed group (EER). In our open fracture example, ARR = 0.20 − 0.05 = 0.15, giving an NNT of 1/0.15 ≈ 6.7, or 7 patients when rounded up.

At this point, it might be useful to reflect on the origin of ORs. The first rationale has to do with study design. In cross-sectional studies, the RR can be calculated from the prevalence. In cohort studies, the RR can be calculated from the incidence. If the incidence or prevalence is not available, as in case-control studies, then the OR may be the only option to provide an indication of the measure of association.¹ It is important to remember that case-control studies are typically used to study rare diseases or events. Why this is relevant will hopefully make more sense shortly.

The second reason for ORs’ existence is statistical in nature and somewhat more complex. Basically, logistic regression provides an OR rather than an RR, even in a cohort study, because of the frequency of convergence problems during the mathematical modelling.² What is a convergence problem? The explanation is beyond the scope of this piece, and my understanding. It has something to do with the fact that regression aims to maximise the likelihood (by finding