Background: A body of research suggests that many participants in alcohol use treatment research begin making changes to their drinking behavior prior to beginning treatment. Although several studies have examined factors associated with such change, results have been mixed or yielded small effects. Both theoretical and methodological limitations have hindered efforts to understand such change, highlighting the need for novel analytic approaches. This study aimed to examine how traditional modeling methods (linear and logistic regression) and machine learning (recursive partitioning, random forests, neural networks, and support vector machines) may be used to predict pretreatment drinking changes.
Methods: Using baseline psychological constructs and demographic variables from an existing dataset of 175 predominantly white (93.7%) participants, we randomly split the data into a training dataset (80% of available data) and a testing dataset (20% of available data). We fit the models on the training dataset and then used the trained models to generate predictions for the held-out testing dataset. We ran models predicting the percent change in drinking days and heavy drinking days, and classification models predicting whether participants were pretreatment changers (50% reduction in drinking days; 70% reduction in percent heavy drinking days).
Results: Overall, the neural network models tended to have the highest predictive accuracy, although areas under the curve (AUCs) across models ranged from poor to acceptable. Variable importance algorithms indicated that demographic factors (e.g., education and income) and psychological constructs (e.g., processes of change) were among the predictors that contributed most to improved model performance.
Conclusions: The identification of demographic variables as important predictors underscores the need to understand demographic and societal-level factors as potential drivers of pretreatment drinking changes.
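
Illustrative sketch: the data handling described in Methods (a random 80/20 split and the changer thresholds) might look roughly like the following. This is a minimal sketch under stated assumptions, not the authors' code. The function names, the fixed seed, and the choice to treat the thresholds as inclusive ("at least 50%" and "at least 70%") are all hypothetical, and the participant records here are placeholder integers rather than study data.

```python
import random


def percent_reduction(baseline, pretreatment):
    """Percent reduction from baseline to pretreatment (positive = less drinking)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - pretreatment) / baseline


def is_changer(baseline, pretreatment, threshold):
    """Label a participant a pretreatment changer.

    Assumption: the threshold is inclusive; the abstract does not say
    whether the cutoffs are strict or inclusive.
    """
    return percent_reduction(baseline, pretreatment) >= threshold


def train_test_split(records, train_fraction=0.8, seed=0):
    """Randomly split records into training and testing lists."""
    shuffled = list(records)               # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumed)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Placeholder IDs standing in for the 175 participants.
participants = list(range(175))
train, test = train_test_split(participants)
```

With 175 participants, an 80/20 split of this kind gives 140 training and 35 testing records. Models would then be fit on `train` only, with `test` reserved for evaluating predictions.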