In this commentary, we describe three indispensable evaluation steps toward the real-world deployment of machine learning in healthcare and provide reference examples for diagnostic, therapeutic, and prognostic tasks. We encourage researchers to move beyond retrospective and within-sample validation and toward practical implementation at the bedside, rather than leaving developed machine learning models to gather dust in the archived literature.