Machine learning and deep learning-based advanced classification techniques for the detection of major depressive disorder

Impact factor: 2.4 | Zone 3 (Management) | JCR Q3 (COMPUTER SCIENCE, INFORMATION SYSTEMS) | Aslib Journal of Information Management | Pub Date: 2023-07-11 | DOI: 10.1108/ajim-10-2022-0468
Abhinandan Chatterjee, P. Bala, Shruti Gedam, Sanchita Paul, N. Goyal

Abstract

Purpose: Depression is a mental health problem characterized by a persistent sense of sadness and loss of interest. EEG signals are regarded as the most appropriate instruments for diagnosing depression because they reflect the operating status of the human brain. The purpose of this study is the early detection of depression using EEG signals.

Design/methodology/approach: (i) Artifacts are removed by filtering, and linear and non-linear features are extracted; (ii) features are scaled with a standard scaler, and principal component analysis (PCA) is used for feature reduction; (iii) the linear features, the non-linear features and combinations of both (only those with the highest accuracy) are passed to ML and DL classifiers for the classification of depression; and (iv) a total of 15 distinct ML and DL methods are applied: KNN, SVM, bagging SVM, RF, GB, Extreme Gradient Boosting (XGBoost), MNB, AdaBoost, Bagging RF, BootAgg, Gaussian NB, RNN, 1DCNN, RBFNN and LSTM, all of which have been used effectively as classifiers on a variety of real-world problems.

Findings:
1. Among the linear features, alpha, alpha asymmetry, gamma and gamma asymmetry give the best results; among the non-linear features, relative wavelet energy (RWE), detrended fluctuation analysis (DFA), correlation dimension (CD) and approximate entropy (AE) give the best results.
2. For linear features, gamma and alpha asymmetry reach 99.98% accuracy with Bagging RF, and gamma asymmetry reaches 99.98% accuracy with BootAgg.
3. For non-linear features, RWE and DFA reach 99.84% accuracy with RF, DFA reaches 99.97% accuracy with XGBoost and RWE reaches 99.94% accuracy with BootAgg.
4. With DL on linear features, gamma asymmetry reaches more than 96% accuracy with RNN and 91% accuracy with LSTM; on non-linear features, CD and AE reach 89% accuracy with LSTM.
5. When linear and non-linear features are combined, the highest accuracy (98.50%) is achieved by Bagging RF with gamma asymmetry + RWE. In DL, alpha + RWE, gamma asymmetry + CD and gamma asymmetry + RWE reach 98% accuracy with LSTM.

Originality/value: A novel dataset was collected at the Central Institute of Psychiatry (CIP), Ranchi, and recorded with 128 channels, whereas most previous studies used fewer channels. The details of the study participants are summarized, and a model is developed for statistical analysis using N-way ANOVA. Artifacts are removed by high-pass and low-pass filtering of the epoched data, followed by re-referencing and independent component analysis for noise removal. The extracted linear features are band power and interhemispheric asymmetry; the extracted non-linear features are relative wavelet energy, wavelet entropy, approximate entropy, sample entropy, detrended fluctuation analysis and correlation dimension. The model uses 213,072 epochs of 5 s EEG data, which allows it to train for longer and thereby increases the efficiency of the classifiers. Feature scaling uses a standard scaler rather than normalization because it helps increase the accuracy of the models (especially the deep learning algorithms), while PCA is used for feature reduction. The linear features, the non-linear features and combinations of both are analysed extensively with ML and DL classifiers for the classification of depression. The combinations of linear and non-linear features (only those with the highest accuracy) are used to obtain the best detection results.
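As an illustration of one of the non-linear features named in the abstract, the following is a minimal sketch of approximate entropy for a single EEG epoch. It follows the commonly used definition (Chebyshev distance between length-m templates, self-matches included) and is not taken from the paper. The embedding dimension m = 2 and the tolerance of 0.2 times the standard deviation are conventional defaults assumed here, and `approximate_entropy` and `demo_signals` are hypothetical names introduced only for this example.

```python
import math
import random


def approximate_entropy(signal, m=2, r_factor=0.2):
    """Approximate entropy (ApEn) of a 1-D sequence.

    m and r_factor are assumed defaults, not values reported in the paper.
    """
    n = len(signal)
    if n < m + 2:
        raise ValueError("signal is too short for the chosen m")

    # Tolerance r is a fraction of the (population) standard deviation.
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    r = r_factor * std

    def phi(dim):
        # All overlapping templates of length `dim`.
        templates = [signal[i:i + dim] for i in range(n - dim + 1)]
        total = len(templates)
        log_sum = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r (self-match included,
            # so the count is never zero and the logarithm is always defined).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            log_sum += math.log(matches / total)
        return log_sum / total

    # ApEn is the drop in average log match frequency from length m to m + 1.
    return phi(m) - phi(m + 1)


def demo_signals(length=200, seed=0):
    """Return a regular (sine) and an irregular (Gaussian noise) test signal."""
    rng = random.Random(seed)
    regular = [math.sin(2 * math.pi * i / 20) for i in range(length)]
    noisy = [rng.gauss(0.0, 1.0) for _ in range(length)]
    return regular, noisy


if __name__ == "__main__":
    regular, noisy = demo_signals()
    print("ApEn (sine): ", approximate_entropy(regular))
    print("ApEn (noise):", approximate_entropy(noisy))
```

A regular signal should give a value close to zero and an irregular one a clearly larger value, which is why such a measure can be used as a classifier feature. This pure-Python version is quadratic in the epoch length, so a vectorized implementation would be preferable for a dataset of the size described above.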
Source journal
Aslib Journal of Information Management (COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 5.30
Self-citation rate: 19.20%
Articles published: 79
About the journal: Aslib Journal of Information Management covers a broad range of issues in the field, including economic, behavioural, social, ethical, technological, international, business-related, political and management-orientated factors. Contributors are encouraged to spell out the practical implications of their work. Areas of interest include topics such as social media, data protection, search engines, information retrieval, digital libraries, information behaviour, intellectual property and copyright, information industry, digital repositories and information policy and governance.