Machine learning algorithm-based estimation model for the severity of depression assessed using Montgomery-Asberg depression rating scale.
Authors: Masanori Shimamoto, Kanako Ishizuka, Kento Ohtani, Toshiya Inada, Maeri Yamamoto, Masako Tachibana, Hiroki Kimura, Yusuke Sakai, Kazuhiro Kobayashi, Norio Ozaki, Masashi Ikeda
Journal: Neuropsychopharmacology Reports. Published 2024-03-01 (Epub 2023-12-20).
DOI: https://doi.org/10.1002/npr2.12404
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10932776/pdf/
Aim: Depressive disorder is often evaluated using established rating scales. However, consistent data collection with these scales requires trained professionals. The present study assessed the "rater & estimation-system" reliability between the consensus evaluation by trained psychiatrists and the estimates from two models of the AI-MADRS (Montgomery-Asberg Depression Rating Scale) estimation system, a machine learning algorithm-based model developed to assess the severity of depression.
Methods: During interviews with trained psychiatrists and the AI-MADRS estimation system, patients responded orally to machine-generated voice prompts from the AI-MADRS structured interview questions. The severity scores estimated by the two models of the AI-MADRS estimation system, the max estimation model and the average estimation model, were compared with those assigned by trained psychiatrists.
Results: A total of 51 evaluation interviews conducted on 30 patients were analyzed. Pearson's correlation coefficient with the scores evaluated by trained psychiatrists was 0.76 (95% confidence interval 0.62-0.86) for the max estimation model, and 0.86 (0.76-0.92) for the average estimation model. The ANOVA-based intraclass correlation coefficient (ICC) for rater & estimation-system reliability against the scores evaluated by trained psychiatrists was 0.51 (-0.09 to 0.79) for the max estimation model, and 0.75 (0.55-0.86) for the average estimation model.
Conclusion: The average estimation model of AI-MADRS demonstrated substantially acceptable rater & estimation-system reliability with trained psychiatrists. Accumulating a broader training dataset and refining the AI-MADRS interviews are expected to improve the performance of AI-MADRS. Our findings suggest that AI technologies can significantly modernize and potentially revolutionize depression assessment.
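The two agreement statistics reported in the Results measure different things, and a small sketch may make the gap between them easier to read. The code below is illustrative only. The abstract does not state which ICC form or which confidence-interval method was used, so the two-way, absolute-agreement, single-measure ICC and the Fisher z interval here are assumptions. The function names and the toy scores are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for r via the Fisher z-transform.

    Assumes n independent pairs; repeated interviews of the same
    patient would violate this, so treat the result as a rough guide.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def icc_agreement(x, y):
    """Two-way, absolute-agreement, single-measure ICC for two raters.

    Built from ANOVA mean squares. This ICC form is an assumption;
    the abstract only says "ANOVA ICC".
    """
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / k for a, b in zip(x, y)]
    col_means = [sum(x) / n, sum(y) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(x) + list(y))
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy scores: the "model" tracks the rater perfectly
# but reads 8 points higher on every interview.
psychiatrist = [10, 14, 18, 22, 26, 30, 34]
model_shifted = [v + 8 for v in psychiatrist]

r_shifted = pearson_r(psychiatrist, model_shifted)
icc_shifted = icc_agreement(psychiatrist, model_shifted)
ci_low, ci_high = fisher_ci(0.76, 51)
```

On this toy data the correlation is 1 while the ICC is 0.7, because an absolute-agreement ICC penalizes a constant offset that Pearson's r ignores. That is one plausible reading of why a model can show a high correlation with psychiatrist scores yet a lower ICC, although the abstract does not report whether either model was biased in this way. Under the independence assumption, `fisher_ci(0.76, 51)` gives an interval close to the one reported for the max estimation model.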