Sergazy Narynov, Daniyar Mukhtarkhanuly, B. Omarov, K. Kozhakhmet, Bauyrzhan Omarov
Machine Learning Approach to Identifying Depression Related Posts on Social Media
2020 20th International Conference on Control, Automation and Systems (ICCAS), pp. 6-11
Published: 2020-10-13
DOI: https://doi.org/10.23919/ICCAS50221.2020.9268336
Citations: 2
Abstract
According to the latest WHO data, published in 2017, there were 4,855 suicides in Kazakhstan, or 3.55% of all deaths. The age-adjusted death rate is 27.74 per 100,000 population, which ranks Kazakhstan 4th in the world by this indicator. This article compares supervised and unsupervised machine learning algorithms for identifying depressive content in social media posts, with the semantic analysis focused on hopelessness and psychological pain as key causes of suicide. Suicide is not spontaneous: preparation can last about a year, during which a person shows signs of their condition, in our case by posting depressive content on their social network profile. The algorithm detects depressive content that may lead to suicide, so that people can be directed to confidential help from psychologists at the national center for suicide prevention in Kazakhstan. The highest F1 score, 95%, was obtained by a random forest (supervised learning) with TF-IDF vectorization; the K-means algorithm (unsupervised learning) with TF-IDF also shows impressive results, only 4% lower in F1 and accuracy.
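As a rough illustration of the unsupervised half of this comparison, the sketch below builds TF-IDF vectors, groups them with a plain two-cluster K-means, and scores the clusters with F1. The toy posts, their labels, the smoothed IDF formula, and the helper names (`tfidf`, `kmeans`, `f1_score`) are assumptions made for illustration, not the authors' data or implementation. The random forest is left out because it is normally taken from a library rather than written by hand.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for tokenised docs."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    # Smoothed IDF (an assumption): a term found in every doc keeps weight 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        v = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k=2, iters=20):
    """Plain Lloyd's K-means; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Hypothetical toy posts; 1 = depressive, 0 = neutral.
posts = [
    "i feel hopeless and empty",
    "great football match today",
    "the pain never stops i feel hopeless",
    "we watched a great match today",
    "so empty and tired of the pain",
    "football today with friends",
]
labels = [1, 0, 1, 0, 1, 0]

vocab, vectors = tfidf([p.split() for p in posts])
clusters = kmeans(vectors, k=2)

# Cluster ids are arbitrary, so score both cluster-to-label mappings.
best_f1 = max(f1_score(labels, clusters),
              f1_score(labels, [1 - c for c in clusters]))
print(f"K-means + TF-IDF F1 on toy data: {best_f1:.2f}")
```

Because K-means never sees the labels, its cluster ids carry no meaning of their own; taking the better of the two mappings is one simple way to compare it against a supervised classifier on the same F1 metric.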