Association Is Not Prediction—A Pervasive Issue in the Medical Literature

Clinical and Experimental Allergy (IF 5.2; Zone 2, Medicine; JCR Q1, Allergy) Pub Date: 2025-03-26 DOI: 10.1111/cea.70040

Kaitlin Stangroome, Michael R. Perkin

Clinical and Experimental Allergy 55(7): 583-585. Journal Article. https://onlinelibrary.wiley.com/doi/10.1111/cea.70040


Abstract

To the Editor,

Varga et al. recently observed that the available scientific literature ‘demonstrates a common tendency to claim predictive value from studies aimed at determining associations’ [1]. While research studies often uncover numerous associations, associations themselves do not convey predictive value and ‘confusion between association and prediction harms clinicians, scientists, and ultimately, the patients’ [1]. Varga et al. systematically reviewed published papers in diabetes epidemiology that made claims about prediction in their titles and assessed whether the abstracts reported appropriate measures of prediction [1]. They sought titles in which biomarkers were evaluated for their ability to predict diabetes as an outcome, or in which diabetes was used to predict another outcome (e.g., disease progression, response to treatment or the development of complications) [1]. They found that 61% of papers referring to prediction in their titles did not report predictive metrics in their abstracts [1]. In a simulation exercise, they demonstrated that biomarkers with large effect sizes and statistically significant p values can still offer poor discriminative utility [1]. We undertook a systematic review to determine whether similar results are seen in allergy epidemiology, acknowledging that disease prediction is a much more established area of research for diabetes than for allergy. We used the same categorisation of statistical methodologies as Varga et al. [1] to identify measures of prediction and measures of association.
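The point about strong associations with weak discrimination can be illustrated with a minimal sketch. This is not the simulation of Varga et al.; the odds ratio of 2 per within-group standard deviation, the sample size, the seed and all function names are assumptions chosen for illustration. The biomarker is drawn from a unit-variance normal distribution in both groups, with the case mean shifted by ln(2), which corresponds to that odds ratio in a logistic model.

```python
import math
import random
from bisect import bisect_left, bisect_right
from statistics import NormalDist, mean, variance


def auc(cases, controls):
    """Probability that a random case scores higher than a random control
    (ties count one half); this equals the area under the ROC curve."""
    ordered = sorted(controls)
    total = 0.0
    for value in cases:
        below = bisect_left(ordered, value)
        ties = bisect_right(ordered, value) - below
        total += below + 0.5 * ties
    return total / (len(cases) * len(controls))


def two_sided_p(cases, controls):
    """Large-sample z-test for a difference in group means."""
    se = math.sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    z = (mean(cases) - mean(controls)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate(odds_ratio_per_sd=2.0, n_per_group=2000, seed=1):
    """Hypothetical biomarker: unit-variance normal in both groups, with the
    case mean shifted by ln(OR), giving that odds ratio per within-group SD."""
    rng = random.Random(seed)
    shift = math.log(odds_ratio_per_sd)
    controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    cases = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
    return cases, controls


cases, controls = simulate()
p_value = two_sided_p(cases, controls)
observed_auc = auc(cases, controls)
# Closed form for two equal-variance normals: AUC = Phi(shift / sqrt(2)).
expected_auc = NormalDist().cdf(math.log(2.0) / math.sqrt(2.0))
print(f"p = {p_value:.3g}, AUC = {observed_auc:.3f}, theoretical AUC = {expected_auc:.3f}")
```

Under these assumptions the association is overwhelmingly significant, yet the theoretical AUC is only about 0.69, so a randomly chosen case has a higher biomarker value than a randomly chosen control only about 69% of the time.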

A PubMed search was conducted on 25 May 2023 to identify studies published between February 1981 and May 2023 that met the search criteria. The aim was to find studies whose titles contained the term prediction alongside an allergy descriptor. The search retrieved 1003 titles, of which 173 were subsequently excluded, leaving 830 titles and abstracts. These 830 abstracts were categorised by allergic condition and area of investigation.

Using the criteria specified by Varga et al., we first identified abstracts reporting prediction metrics and then those reporting association metrics. Abstracts containing neither measures of prediction nor measures of association were designated as undefined. Overall, only 39% of the studies (323/830) reported prediction metrics in their abstracts. The remaining 61% (507/830) comprised 38% (317/830) that reported methods of association and 23% (190/830) that did not report a clear methodology (undefined). Only 17% of the studies reported sensitivity and specificity in their abstracts, with 142/830 reporting sensitivity and 139/830 reporting specificity.
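The arithmetic behind these proportions can be checked with a short sketch. The counts are those reported in the text; the variable and function names are illustrative assumptions.

```python
# Counts as reported in the letter.
retrieved = 1003
excluded = 173
included = retrieved - excluded

counts = {
    "prediction metrics": 323,
    "association metrics": 317,
    "undefined": 190,
}


def percent(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# The three categories should account for every included abstract.
assert sum(counts.values()) == included

for label, count in counts.items():
    print(f"{label}: {count}/{included} = {percent(count, included)}%")

# Abstracts without prediction metrics (association plus undefined).
without_prediction = included - counts["prediction metrics"]
print(f"no prediction metrics: {without_prediction}/{included} = "
      f"{percent(without_prediction, included)}%")

# Sensitivity and specificity were each reported in roughly 17% of abstracts.
print(f"sensitivity: {percent(142, included)}%, specificity: {percent(139, included)}%")
```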

The distribution of allergic conditions across our categorisation of statistical methodologies used (prediction metrics, undefined, association metrics) was broadly similar. However, the areas of investigation did differ. A larger proportion of abstracts with prediction metrics focused on modelling and diagnosis, compared with those without prediction metrics. In contrast, a larger proportion of papers utilising association metrics focused on risk factors compared to papers using prediction metrics.

Amongst prediction metrics, the most common was ROC AUC (Receiver Operating Characteristic Area Under the Curve) with 165 mentions (Figure 1). There were 26 different prediction metrics used (range 1–165 abstracts). Amongst association metrics, the most common method was logistic regression with 147 mentions, out of 17 different methods of association used (range 1–147 abstracts) (Figure 1).

In conclusion, most studies in the field of allergy that claim to show prediction provide no evidence of prediction metrics within their abstracts. Abstracts were more likely to include predictive metrics where the subject area was modelling or diagnosis. Where risk factors have been identified as associations in, for example, a cohort study, and predictive ability has been inferred from them, the term prediction may have been used erroneously.

A strength of our approach was the use of a comprehensive search strategy previously applied in another field of medicine. The principal limitation is that our research was unfunded, and we were unable to undertake a full-text analysis or any form of sensitivity analysis by underlying condition or area of investigation. However, Varga et al. undertook a full-text analysis of 100 papers that had not reported metrics of prediction in their abstracts and found that only 15 contained such metrics in the full text [1]. Others have also raised the concern that the distinction between measures of association and prediction is not appreciated [2-4]. The proportion of allergy papers providing metrics of prediction (39%) is identical to that observed in the field of diabetes [1]. Additionally, even fewer of the papers reported sensitivity and specificity. This consistency suggests that the issue is likely to be pervasive across medical disciplines.

Many of these papers promise improvements in patient care, which could inflate the perceived clinical importance of their findings. Patient care can potentially be compromised when clinicians without statistical training incorrectly infer clinical prediction in the absence of such evidence. If the term prediction is being used appropriately, then sensitivity and specificity are prerequisites for assessing the ability of a measurement to predict an outcome, and their presentation facilitates comparison between diagnostic tools [5, 6]. Tighter statistical scrutiny should be advocated to avoid similar problems in the future, and EQUATOR reporting guidelines such as TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) should be used where appropriate [7].
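As a reminder of how sensitivity and specificity are obtained and why they permit comparison between tools, the following minimal sketch computes both from a 2x2 table. The two tests and all counts are hypothetical and are not drawn from any study discussed here; the class and field names are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionTable:
    """2x2 table of test result against true outcome status."""
    true_positive: int
    false_negative: int
    false_positive: int
    true_negative: int

    @property
    def sensitivity(self) -> float:
        # Proportion of people with the outcome whom the test flags.
        return self.true_positive / (self.true_positive + self.false_negative)

    @property
    def specificity(self) -> float:
        # Proportion of people without the outcome whom the test clears.
        return self.true_negative / (self.true_negative + self.false_positive)


# Hypothetical counts: 100 people with the outcome, 200 without.
test_a = ConfusionTable(true_positive=80, false_negative=20,
                        false_positive=30, true_negative=170)
test_b = ConfusionTable(true_positive=60, false_negative=40,
                        false_positive=10, true_negative=190)

for name, table in (("Test A", test_a), ("Test B", test_b)):
    print(f"{name}: sensitivity {table.sensitivity:.2f}, "
          f"specificity {table.specificity:.2f}")
```

In this made-up comparison Test A misses fewer true cases while Test B produces fewer false alarms, a trade-off that an odds ratio or p value alone would not reveal.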

M.R.P. contributed the original idea. K.S. and M.R.P. contributed to the study design. K.S. conducted the data search, analysed the data, prepared study results, and drafted the manuscript. K.S. and M.R.P. contributed to revising the manuscript and approved the final version.

The authors declare no conflicts of interest.

