Misinterpretation of null-hypothesis tests (p-values) and confidence intervals has been a longstanding issue in epidemiology. Despite efforts by leading journals to discourage or ban such practices, the extent of misinterpretation in the modern epidemiologic literature remains unclear. We examined papers published in 2022 in three leading epidemiology journals (International Journal of Epidemiology, Epidemiology, and American Journal of Epidemiology) to assess the frequency and types of misinterpretations of p-values and confidence intervals. We randomly sampled 64 papers that assessed exposure-outcome relationships. Two authors independently reviewed the selected papers, cataloging misinterpretations according to guidelines published in 2016. While concerns about p-value misuse persist in the scientific literature, our review of recent epidemiological studies reveals encouraging progress: outright statistical misinterpretations were not observed in the leading journals. We identified subtle opportunities to enhance reporting, including reducing reliance on binary “significant” vs. “non-significant” language, pairing p-values with effect sizes more consistently, and interpreting confidence intervals more fully. In a sense, our concerns relate to the suitability of the null-hypothesis testing framework in epidemiology, rather than to its correct application. Notably, we highlight examples of commendable practice in which studies successfully integrated statistical results with clinical and public health context. Modern epidemiological research shows improved statistical reporting, although some concerns persist. Importantly, the findings of this review apply only to the primary results as reported in published manuscripts and do not extend to the broader analytic process that generates those results.
The assumptions embedded in that analytic process are not secondary to hypothesis testing; rather, they contribute as much to the resulting p-value as the target hypothesis itself, and overlooking them can lead to overly optimistic interpretations. Recognizing this distinction is essential for contextualizing our conclusions and for situating p-values and confidence intervals within the broader inferential framework. We recommend targeted refinements: avoiding binary language, mandating effect size reporting, and developing methods to interpret confidence intervals beyond null-hypothesis testing. These steps will align the field with evolving standards while preserving the utility of p-values where appropriate.
