Jae Won Suh, Rob Saunders, Elizabeth Simes, Henry Delamain, Stephen Butler, David Cottrell, Abdullah Kraam, Stephen Scott, Ian M Goodyer, James Wason, Stephen Pilling, Peter Fonagy
European Child & Adolescent Psychiatry, journal article, published 2024-10-08. DOI: https://doi.org/10.1007/s00787-024-02592-7
Abstract
Predicting criminal offence in adolescents who exhibit antisocial behaviour: a machine learning study using data from a large randomised controlled trial of multisystemic therapy.
Introduction: Accurate prediction of short-term offending in young people exhibiting antisocial behaviour could support targeted interventions. Here we develop a set of machine learning (ML) models that predict offending status with good accuracy; furthermore, we show that interpretable ML analyses can complement these models to inform clinical decision-making.
Methods: This study included 679 individuals aged 11-17 years who displayed moderate-to-severe antisocial behaviour, drawn from a randomised controlled trial of multisystemic therapy in England. The outcome was any criminal offence in the 18 months after study baseline. Four types of ML algorithms were trained: logistic regression, elastic net regression, random forest, and gradient boosting machine (GBM). Prediction models were developed (1) using predictors readily available to clinicians (e.g. sociodemographics, previous convictions), and (2) with additional information (e.g. parenting). Model-agnostic feature importance values were calculated and the most important predictors identified. Nested cross-validation with 100 iterations of random data splits and 10-fold cross-validation within each iteration was employed, and the average predictive performance was reported.
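The resampling scheme described in the Methods can be sketched as follows. This is a minimal illustration of the index bookkeeping only, not the authors' code: the function names, the 20% held-out fraction, and the seed are assumptions, since the abstract does not state how each random split was sized. Each outer iteration draws a random train/test split, and the training part is divided into 10 folds that could be used for model tuning.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle a list of indices and deal it into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_resampling(n_samples, n_iterations=100, test_fraction=0.2, k=10, seed=0):
    """Yield (train, test, inner) once per outer iteration.

    train, test: index lists from one random split of the data.
    inner: list of k (fit, validate) index pairs built from train only.
    test_fraction=0.2 is an assumed value, not taken from the abstract.
    """
    rng = random.Random(seed)
    all_indices = list(range(n_samples))
    n_test = int(round(test_fraction * n_samples))
    for _ in range(n_iterations):
        shuffled = all_indices[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        folds = k_fold_indices(train, k, rng)
        inner = []
        for i, validate in enumerate(folds):
            # Fit on every fold except the i-th, validate on the i-th.
            fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            inner.append((fit, validate))
        yield train, test, inner


# 679 participants, 100 outer iterations, 10 inner folds each.
splits = list(nested_resampling(679))
```

In a full analysis, a model would be tuned on the inner folds, refitted on the training indices, and scored on the held-out test indices, with the 100 scores then averaged.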
Results: Among the ML models using readily available predictors, the GBM was the strongest model (AUC 0.85, 95% CI 0.85-0.86); the other models had average AUCs of 0.82. This performance was better than that of a model using only the total number of previous offences as the predictor (AUC 0.67, 95% CI 0.66-0.68) and of a model that simply took past offending status as the prediction (AUC 0.81, 95% CI 0.80-0.81). Additional predictors slightly increased the performance of the logistic regression and random forest models but decreased the performance of the elastic net regression and GBM-based models.
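As a rough illustration of how figures such as "AUC 0.85, 95% CI 0.85-0.86" can arise, the sketch below computes AUC for one held-out set and summarises per-iteration AUCs as a mean with an interval. The abstract does not say how the intervals were derived, so the normal-approximation interval across iterations is an assumption, and the function names and toy values are hypothetical.

```python
import math
import statistics


def auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def mean_with_ci(values, z=1.96):
    """Mean of per-iteration scores with a normal-approximation interval.

    Assumed method: mean +/- z * standard error across iterations.
    """
    mean = statistics.fmean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - half_width, mean + half_width


# Toy data: 1 = offended during follow-up, 0 = did not.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Hypothetical per-iteration AUCs, not values from the study.
mean_auc, lower, upper = mean_with_ci([0.84, 0.85, 0.86])
```

The tie handling matters for the simple baseline in the Results: a binary predictor such as past offending status produces many tied scores, and counting ties as one half keeps its AUC well defined.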
Conclusion: The potential utility of ML approaches for accurately predicting criminal offences in high-risk youth is demonstrated. Interpretable ML-based predictive models could be utilised in youth services or research to help develop and deliver effective interventions.
About the journal:
European Child and Adolescent Psychiatry is Europe's only peer-reviewed journal entirely devoted to child and adolescent psychiatry. It aims to further a broad understanding of psychopathology in children and adolescents. Empirical research is its foundation, and clinical relevance is its hallmark.
European Child and Adolescent Psychiatry welcomes in particular papers covering neuropsychiatry, cognitive neuroscience, genetics, neuroimaging, pharmacology, and related fields of interest. Contributions are encouraged from all around the world.