Association Is Not Prediction—A Pervasive Issue in the Medical Literature
Authors: Kaitlin Stangroome, Michael R. Perkin
Journal: Clinical and Experimental Allergy 55(7): 583-585 (Journal Article; impact factor 5.2; JCR Q1, Allergy)
Publication date: 2025-03-26
DOI: 10.1111/cea.70040
URL: https://onlinelibrary.wiley.com/doi/10.1111/cea.70040
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/cea.70040
Citation count: 0
Abstract
To the Editor,
Varga et al. recently observed that the available scientific literature ‘demonstrates a common tendency to claim predictive value from studies aimed at determining associations’ [1]. While research studies often uncover numerous associations, associations themselves do not convey predictive value and ‘confusion between association and prediction harms clinicians, scientists, and ultimately, the patients’ [1]. Varga et al. systematically reviewed published papers in the field of diabetes epidemiology that made claims about prediction in their title and assessed whether they reported findings with appropriate measures of prediction in their abstract [1]. Their aim was to identify titles where certain biomarkers were evaluated for their predictive ability in relation to diabetes as an outcome, or where diabetes would predict another outcome (e.g., disease progression, response to treatment or the development of complications) [1]. They found that 61% of papers referring to prediction in their titles did not report predictive metrics in their abstracts [1]. In a simulation exercise, they demonstrated that biomarkers with large effect sizes and statistically significant p values can still offer poor discriminative utility [1]. We undertook a systematic review to see whether similar results are seen in the field of allergy epidemiology, acknowledging that disease prediction is a much more established area of research for diabetes than it is for allergy. We used the same categorisation of statistical methodologies as Varga et al. [1] to identify measures of prediction and measures of association.
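The point that a statistically significant association need not imply useful discrimination can be illustrated with a minimal sketch. This is not the simulation reported by Varga et al.; the biomarker, the group sizes, the half-standard-deviation shift between cases and controls, and all function names are assumptions chosen only for illustration.

```python
import math
import random
import statistics


def simulate(n_per_group=500, shift=0.5, seed=0):
    """Draw a hypothetical biomarker: controls ~ N(0, 1), cases ~ N(shift, 1)."""
    rng = random.Random(seed)
    controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    cases = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
    return controls, cases


def two_sided_p(controls, cases):
    """Large-sample z-test for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(controls) / len(controls)
                   + statistics.variance(cases) / len(cases))
    z = (statistics.mean(cases) - statistics.mean(controls)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def roc_auc(controls, cases):
    """AUC as the probability that a random case scores above a random control."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


controls, cases = simulate()
p_value = two_sided_p(controls, cases)  # measure of association
auc = roc_auc(controls, cases)          # measure of prediction (discrimination)
print(f"p = {p_value:.2e}, AUC = {auc:.2f}")
```

Under these assumptions the group difference is highly significant, yet the expected AUC for two equal-variance normal distributions half a standard deviation apart is only about 0.64, so a randomly chosen case outranks a randomly chosen control in roughly two of every three comparisons.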
A PubMed search was conducted on the 25th of May 2023 to identify studies published between February 1981 and May 2023 that met the search criteria. The aim was to find studies which contained prediction in their titles alongside an allergy descriptor. The search retrieved 1003 titles, of which 173 were subsequently excluded, leaving 830 abstracts and titles. These 830 abstracts were then categorised by allergic condition and area of investigation.
Using the criteria specified in the Varga et al. paper, we first identified abstracts reporting prediction metrics and then those reporting association metrics. Abstracts containing measures of neither prediction nor association were designated as undefined. Overall, only 39% of the studies (323/830) reported prediction metrics in their abstracts. The remaining 61% (507/830) comprised 38% (317/830) that reported methods of association and 23% (190/830) that did not report a clear methodology (undefined). Only 17% of the studies reported sensitivity and specificity in their abstracts, with 142/830 reporting sensitivity and 139/830 reporting specificity.
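The proportions above follow directly from the reported counts. As a quick arithmetic check, the short sketch below recomputes them; the counts are those stated in the text, while the variable names are ours.

```python
# Counts of abstracts in each category, as reported in the text.
counts = {"prediction": 323, "association": 317, "undefined": 190}

total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}

# Abstracts without prediction metrics: association plus undefined.
without_prediction = counts["association"] + counts["undefined"]
share_without = round(100 * without_prediction / total)

# Sensitivity and specificity were each reported in about 17% of abstracts.
share_sensitivity = round(100 * 142 / total)
share_specificity = round(100 * 139 / total)

print(total, shares, without_prediction, share_without)
```

The three categories sum to the 830 abstracts analysed, and each rounded percentage matches the figure quoted.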
The distribution of allergic conditions across our categorisation of statistical methodologies used (prediction metrics, undefined, association metrics) was broadly similar. However, the areas of investigation did differ. A larger proportion of abstracts with prediction metrics focused on modelling and diagnosis, compared with those without prediction metrics. In contrast, a larger proportion of papers utilising association metrics focused on risk factors compared to papers using prediction metrics.
Amongst prediction metrics, the most common metric was ROC AUC (Receiver Operating Characteristics Area Under the Curve) with 165 mentions (Figure 1). There were 26 different prediction metrics used (range 1–165 abstracts). For association metrics, the most common metric was logistic regression with 147 mentions, out of 17 different methods of association used (range 1–147 abstracts) (Figure 1).
In conclusion, most studies in the field of allergy that claim to show prediction report no prediction metrics in their abstracts. Abstracts were more likely to include predictive metrics where the subject area was modelling or diagnosis. Where risk factors have been identified as associations in, for example, a cohort study, and predictive ability has been inferred, the term prediction may have been used erroneously.
The strength of our approach was its use of a comprehensive search strategy previously applied in another field of medicine. The principal limitation is that our research was unfunded, and we were not able to undertake a full text analysis or any form of sensitivity analysis by underlying condition or investigation. However, Varga et al. did undertake a full text analysis of 100 of the papers they had identified as not reporting metrics of prediction in their abstract, and found that only 15 contained metrics of prediction in the full text [1]. The concern that the distinction between measures of association and prediction is not appreciated has been raised by others [2-4]. The 39% of allergy papers providing metrics of prediction is identical to the 39% observed in the field of diabetes [1]. Additionally, even fewer of the papers reported sensitivity and specificity metrics. This consistency suggests that the problem is likely to be pervasive across medical disciplines.
Many of these papers promise improvements in patient care, which could inflate the clinical importance of their findings. Patient care can potentially be compromised when clinicians without statistical training incorrectly infer clinical prediction where such evidence is absent. Where the term prediction is used appropriately, sensitivity and specificity are prerequisites for judging the ability of a measurement to predict an outcome, and presenting them facilitates comparison between diagnostic tools [5, 6]. Tighter statistical scrutiny should be advocated to avoid similar problems in the future, and EQUATOR reporting guidelines such as TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) should be used where appropriate [7].
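For readers less familiar with these measures, the sketch below computes sensitivity and specificity from a 2x2 table and, for contrast, the odds ratio as a measure of association. The counts are hypothetical and the function name is ours; the example is illustrative only and is not drawn from any of the papers reviewed.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return sensitivity, specificity and the odds ratio for a 2x2 table."""
    sensitivity = tp / (tp + fn)        # proportion of true cases the test flags
    specificity = tn / (tn + fp)        # proportion of non-cases the test clears
    odds_ratio = (tp * tn) / (fn * fp)  # measure of association, for contrast
    return sensitivity, specificity, odds_ratio


# Hypothetical counts: 100 people with the outcome, 300 without.
sens, spec, odds = diagnostic_summary(tp=30, fn=70, tn=270, fp=30)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, odds ratio = {odds:.2f}")
```

With these hypothetical counts the odds ratio is about 3.9, which would usually be read as a strong association, yet the sensitivity is 0.30, meaning the marker misses 70% of those who develop the outcome.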
M.R.P. contributed the original idea. K.S. and M.R.P. contributed to the study design. K.S. conducted the data search, analysed the data, prepared study results, and drafted the manuscript. K.S. and M.R.P. contributed to revising the manuscript and approved the final version.
About the journal:
Clinical & Experimental Allergy strikes an excellent balance between clinical and scientific articles and carries regular reviews and editorials written by leading authorities in their field.
In response to the increasing number of quality submissions, since 1996 the journal's size has increased by over 30%. Clinical & Experimental Allergy is essential reading for allergy practitioners and research scientists with an interest in allergic diseases and mechanisms. Truly international in appeal, Clinical & Experimental Allergy publishes clinical and experimental observations in disease in all fields of medicine in which allergic hypersensitivity plays a part.