Background
The term population is frequently used in clinical research and statistics, but it covers several distinct and easily confused concepts. Populations are a roundabout way of conceiving classifications, generalizations and inductive inferences. When misapplied, the term can lead to serious errors in study design, analysis and interpretation.
Methods
We review various notions of populations, their relationship with statistical inferences, and whether they refer to persons, variables or theoretical constructions.
Results
Statistical inferences are either design-based or model-based. The simplest design-based inference is from a representative random sample to a real, definite population, but it is rarely possible or even pertinent in clinical research. The term population rarely concerns patients. Super-populations are theoretical postulates of statistical models that attempt to explain the distributions and relationships of variables. Pseudo-populations are mathematical constructs used to balance baseline characteristics in order to extract causal inferences from observational studies. Statistical populations are as numerous as variables. This leads to an explosion of entities, with much room for divergent analyses and manipulations. Target populations are those to whom study results should apply. In the absence of a real population, they are erroneously equated with the eligibility criteria of study subjects. The inductive problem remains unsolved, for inferences from study subjects to future patients then depend on the meaning of the words used in indefinite descriptions.
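As an illustration of how a pseudo-population is a mathematical construct rather than a group of patients, the sketch below reweights a small cohort so that a baseline characteristic is balanced between treatment arms. It assumes inverse probability of treatment weighting, which is one common way of building such a construct and is not named in the text above. The cohort counts and the function names (`propensity_by_stratum`, `ipw_weight`, `prevalence_of_x`) are hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical cohort (an assumption for illustration, not study data):
# each row is (x, t), x = binary baseline characteristic, t = 1 if treated.
cohort = [(0, 0)] * 80 + [(0, 1)] * 20 + [(1, 0)] * 40 + [(1, 1)] * 60


def propensity_by_stratum(rows):
    """Estimate P(t = 1 | x) as the observed treated fraction in each stratum of x."""
    totals = Counter(x for x, _ in rows)
    treated = Counter(x for x, t in rows if t == 1)
    return {x: treated[x] / totals[x] for x in totals}


def ipw_weight(x, t, propensity):
    """Inverse probability of treatment weight for one subject."""
    e = propensity[x]
    return 1.0 / e if t == 1 else 1.0 / (1.0 - e)


def prevalence_of_x(rows, arm, weights=None):
    """(Weighted) proportion of subjects with x = 1 within one treatment arm."""
    if weights is None:
        weights = [1.0] * len(rows)
    num = sum(w for (x, t), w in zip(rows, weights) if t == arm and x == 1)
    den = sum(w for (x, t), w in zip(rows, weights) if t == arm)
    return num / den


propensity = propensity_by_stratum(cohort)
weights = [ipw_weight(x, t, propensity) for x, t in cohort]

# Observed cohort: x = 1 is much more frequent among the treated
# (about 0.75) than among the untreated (about 0.33).
raw_treated = prevalence_of_x(cohort, arm=1)
raw_untreated = prevalence_of_x(cohort, arm=0)

# Weighted pseudo-population: x = 1 has the same prevalence (about 0.5) in both arms.
pseudo_treated = prevalence_of_x(cohort, arm=1, weights=weights)
pseudo_untreated = prevalence_of_x(cohort, arm=0, weights=weights)

print(raw_treated, raw_untreated, pseudo_treated, pseudo_untreated)
```

The balanced "population" exists only as a set of weights: each arm is inflated to a weighted size of about 200, although the cohort contains 200 subjects in total. No such group of patients was ever observed.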
Conclusion
The term population often hides more than it reveals about the problems of generalization and inference. Because the term leads to errors and misconceptions, it should rarely be used in clinical research.