Background: Reliable prediction of clinical progression over time can improve outcomes in depression. Little work has integrated the various risk factors for depression to determine which combinations of factors are most useful for identifying the individuals at greatest risk.
Materials and methods: This study demonstrates that data-driven machine learning (ML) methods such as Random Effects/Expectation Maximization (RE-EM) trees and Mixed Effects Random Forest (MERF) can be applied to reliably identify the variables that have the greatest utility for classifying subgroups at greatest risk for depression. A total of 185 young adults completed measures of depression risk, including rumination, worry, negative cognitive styles, cognitive and coping flexibility, and negative life events, along with measures of depressive symptoms. We trained RE-EM tree and MERF models and compared them with traditional linear mixed models (LMMs) in predicting depressive symptoms both prospectively and concurrently, using cross-validation.
Results: The RE-EM tree and MERF methods modeled complex interactions, identified subgroups of individuals, and predicted depression severity with performance comparable to that of LMMs. Further, the ML models identified brooding, negative life events, negative cognitive styles, and perceived control as the most relevant predictors of future depression levels.
Conclusion: Random effects machine learning models have the potential for high clinical utility and can be leveraged for interventions to reduce vulnerability to depression.
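Illustrative sketch: the random-effects methods named in Materials and methods (RE-EM trees and MERF) are generally described as alternating between fitting a fixed-effects learner to the outcome with the current subject effects removed, and re-estimating the subject effects from the learner's residuals. The Python sketch below illustrates only that alternation for a random-intercept model. It is not the authors' implementation: as a loudly stated simplification, the tree or forest is swapped for a single-predictor least-squares line, and the function name `fit_random_intercept`, the toy data, and all variable names are hypothetical.

```python
def fit_random_intercept(x, y, groups, n_iter=50):
    """EM-style alternation for y = f(x) + b_subject + noise.

    ASSUMPTION: f is a one-predictor least-squares line standing in for
    the regression tree (RE-EM) or random forest (MERF).
    """
    ids = sorted(set(groups))
    b = {g: 0.0 for g in ids}          # subject-level random intercepts
    sigma_b2, sigma_e2 = 1.0, 1.0      # starting variance components
    n = len(y)
    for _ in range(n_iter):
        # Step 1: fit the fixed part to the outcome minus current subject effects.
        target = [yi - b[g] for yi, g in zip(y, groups)]
        x_bar = sum(x) / n
        t_bar = sum(target) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxt = sum((xi - x_bar) * (ti - t_bar) for xi, ti in zip(x, target))
        slope = sxt / sxx
        intercept = t_bar - slope * x_bar

        # Residuals of the raw outcome from the fixed part, grouped by subject.
        resid = {g: [] for g in ids}
        for xi, yi, g in zip(x, y, groups):
            resid[g].append(yi - (intercept + slope * xi))

        # Step 2 (E-step): posterior mean and variance of each intercept.
        # The posterior mean is the subject's mean residual shrunk toward zero.
        post_var = {}
        for g in ids:
            denom = len(resid[g]) * sigma_b2 + sigma_e2
            b[g] = sigma_b2 * sum(resid[g]) / denom
            post_var[g] = sigma_b2 * sigma_e2 / denom

        # Step 3 (M-step): update the two variance components.
        sigma_b2 = sum(b[g] ** 2 + post_var[g] for g in ids) / len(ids)
        sigma_e2 = sum(
            sum((r - b[g]) ** 2 for r in resid[g]) + len(resid[g]) * post_var[g]
            for g in ids
        ) / n
    return intercept, slope, b, sigma_b2, sigma_e2


# Hypothetical toy data: 5 subjects, 4 repeated measures each.
true_b = {0: -2.0, 1: -1.0, 2: 0.0, 3: 1.0, 4: 2.0}
noise = [0.1, -0.1, -0.1, 0.1]
x, y, groups = [], [], []
for subject, offset in true_b.items():
    for wave in range(4):
        x.append(float(wave))
        y.append(1.0 + 0.5 * wave + offset + noise[wave])
        groups.append(subject)

intercept, slope, b_hat, sigma_b2, sigma_e2 = fit_random_intercept(x, y, groups)
print("fixed part:", intercept, slope)
print("estimated subject intercepts:", b_hat)
```

In a fuller analysis, the least-squares step would be replaced by the tree or forest learner, and cross-validation folds would presumably be formed so that repeated measures from one participant are not split across training and test sets; the abstract does not state how the folds were constructed.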