Empirical economic modelers often have to choose between two classes of models, each containing multiple models. In many cases, this decision is based on the predictive ability of the models considered, which exposes the comparison to the known risks of multiple testing and p-hacking. This study presents a new statistical approach for comparing all models in a single test, which serves as a multi-benchmark reality check. The behavior of the test is studied both asymptotically and in small samples. We illustrate the new approach by analyzing whether a family of linear bivariate models outperforms a family of univariate models in predicting commodity prices. The paper also raises new questions for future research. From an empirical viewpoint, we present several open questions in economic modeling that can be tested with multi-benchmark tests. From a theoretical viewpoint, further studies can investigate whether a more general method for approximating or simulating the test distribution can be developed.