Machine learning algorithm-based estimation model for the severity of depression assessed using Montgomery-Asberg depression rating scale.

Neuropsychopharmacology Reports (IF 2.0, Q3, Neurosciences). Pub Date: 2024-03-01. Epub Date: 2023-12-20. DOI: 10.1002/npr2.12404

Masanori Shimamoto, Kanako Ishizuka, Kento Ohtani, Toshiya Inada, Maeri Yamamoto, Masako Tachibana, Hiroki Kimura, Yusuke Sakai, Kazuhiro Kobayashi, Norio Ozaki, Masashi Ikeda

Pages: 115-120. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10932776/pdf/

Abstract

Aim: Depressive disorder is often evaluated using established rating scales, but collecting consistent data with these scales requires trained professionals. This study assessed the "rater & estimation-system" reliability between the consensus evaluation by trained psychiatrists and the estimates from two models of the AI-MADRS estimation system, a machine learning algorithm-based system developed to assess the severity of depression on the Montgomery-Asberg Depression Rating Scale (MADRS).

Methods: During interviews with trained psychiatrists and the AI-MADRS estimation system, patients responded orally to machine-generated voice prompts from the AI-MADRS structured interview questions. The severity scores estimated by two models of the AI-MADRS estimation system, the max estimation model and the average estimation model, were compared with the scores given by trained psychiatrists.

Results: A total of 51 evaluation interviews with 30 patients were analyzed. Pearson's correlation coefficient with the scores given by trained psychiatrists was 0.76 (95% confidence interval 0.62-0.86) for the max estimation model and 0.86 (0.76-0.92) for the average estimation model. The ANOVA-based intraclass correlation coefficient (ICC) for rater & estimation-system reliability against the trained psychiatrists' scores was 0.51 (-0.09 to 0.79) for the max estimation model and 0.75 (0.55-0.86) for the average estimation model.
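
To make the two agreement statistics above concrete, the following is a minimal Python sketch of how a Pearson correlation with a 95% confidence interval and an ANOVA-based ICC can be computed for paired scores. Several things here are assumptions and not taken from the study: the Fisher z-transform is used for the interval, the ICC is the two-way, single-measurement, absolute-agreement form (the abstract says only "ANOVA ICC"), the function names are illustrative, and the score lists are invented placeholder data, not study data.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transform (assumed method).

    Requires |r| < 1 and n > 3.
    """
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def icc_absolute_agreement(x, y):
    """Two-way ANOVA ICC, single measurement, absolute agreement, k = 2 raters.

    This specific ICC form is an assumption; the source does not name one.
    """
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # one row per interview
    col_means = [mean(x), mean(y)]                    # one column per rater
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(x) + list(y))
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical placeholder scores (NOT data from the study).
psychiatrist = [12, 25, 31, 8, 19, 27, 15, 34]
system = [15, 24, 35, 12, 18, 31, 20, 33]

r = pearson_r(psychiatrist, system)
low, high = fisher_ci(r, len(psychiatrist))
icc = icc_absolute_agreement(psychiatrist, system)
print(f"r = {r:.2f} (95% CI {low:.2f} to {high:.2f}), ICC = {icc:.2f}")
```

Under these assumptions, `fisher_ci(0.76, 51)` gives roughly 0.61 to 0.86, close to the reported interval for the max estimation model; the authors may have used a different method, and the 51 interviews come from 30 patients, so they are not fully independent. An absolute-agreement ICC also penalizes a constant offset between raters, whereas Pearson's coefficient does not, which is one general reason the two statistics can differ for the same paired scores.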

Conclusion: The average estimation model of AI-MADRS demonstrated substantially acceptable rater & estimation-system reliability with trained psychiatrists. Accumulating a broader training dataset and refining the AI-MADRS interviews are expected to improve its performance. These findings suggest that AI technologies could modernize, and potentially transform, depression assessment.
