{"title":"A Contemporarymulti–Objective Feature Selection Model for Depression Detection Using a Hybrid pBGSK Optimization Algorithm","authors":"S. Kavi Priya, K. Pon Karthika","doi":"10.34768/amcs-2023-0010","DOIUrl":null,"url":null,"abstract":"Abstract Depression is one of the primary causes of global mental illnesses and an underlying reason for suicide. The user generated text content available in social media forums offers an opportunity to build automatic and reliable depression detection models. The core objective of this work is to select an optimal set of features that may help in classifying depressive contents posted on social media. To this end, a novel multi-objective feature selection technique (EFS-pBGSK) and machine learning algorithms are employed to train the proposed model. The novel feature selection technique incorporates a binary gaining-sharing knowledge-based optimization algorithm with population reduction (pBGSK) to obtain the optimized features from the original feature space. The extensive feature selector (EFS) is used to filter out the excessive features based on their ranking. Two text depression datasets collected from Twitter and Reddit forums are used for the evaluation of the proposed feature selection model. The experimentation is carried out using naive Bayes (NB) and support vector machine (SVM) classifiers for five different feature subset sizes (10, 50, 100, 300 and 500). The experimental outcome indicates that the proposed model can achieve superior performance scores. The top results are obtained using the SVM classifier for the SDD dataset with 0.962 accuracy, 0.929 F1 score, 0.0809 log-loss and 0.0717 mean absolute error (MAE). As a result, the optimal combination of features selected by the proposed hybrid model significantly improves the performance of the depression detection system.","PeriodicalId":50339,"journal":{"name":"International Journal of Applied Mathematics and Computer Science","volume":"199 1","pages":"117 - 131"},"PeriodicalIF":1.6000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Applied Mathematics and Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.34768/amcs-2023-0010","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 1
Abstract
Abstract Depression is one of the primary causes of global mental illnesses and an underlying reason for suicide. The user generated text content available in social media forums offers an opportunity to build automatic and reliable depression detection models. The core objective of this work is to select an optimal set of features that may help in classifying depressive contents posted on social media. To this end, a novel multi-objective feature selection technique (EFS-pBGSK) and machine learning algorithms are employed to train the proposed model. The novel feature selection technique incorporates a binary gaining-sharing knowledge-based optimization algorithm with population reduction (pBGSK) to obtain the optimized features from the original feature space. The extensive feature selector (EFS) is used to filter out the excessive features based on their ranking. Two text depression datasets collected from Twitter and Reddit forums are used for the evaluation of the proposed feature selection model. The experimentation is carried out using naive Bayes (NB) and support vector machine (SVM) classifiers for five different feature subset sizes (10, 50, 100, 300 and 500). The experimental outcome indicates that the proposed model can achieve superior performance scores. The top results are obtained using the SVM classifier for the SDD dataset with 0.962 accuracy, 0.929 F1 score, 0.0809 log-loss and 0.0717 mean absolute error (MAE). As a result, the optimal combination of features selected by the proposed hybrid model significantly improves the performance of the depression detection system.
期刊介绍:
The International Journal of Applied Mathematics and Computer Science is a quarterly published in Poland since 1991 by the University of Zielona Góra in partnership with De Gruyter Poland (Sciendo) and Lubuskie Scientific Society, under the auspices of the Committee on Automatic Control and Robotics of the Polish Academy of Sciences.
The journal strives to meet the demand for the presentation of interdisciplinary research in various fields related to control theory, applied mathematics, scientific computing and computer science. In particular, it publishes high quality original research results in the following areas:
-modern control theory and practice-
artificial intelligence methods and their applications-
applied mathematics and mathematical optimisation techniques-
mathematical methods in engineering, computer science, and biology.