Machine Learning Approach to Identifying Depression Related Posts on Social Media

Sergazy Narynov, Daniyar Mukhtarkhanuly, B. Omarov, K. Kozhakhmet, Bauyrzhan Omarov
Published in: 2020 20th International Conference on Control, Automation and Systems (ICCAS), pp. 6-11
Publication date: 2020-10-13
DOI: 10.23919/ICCAS50221.2020.9268336 (https://doi.org/10.23919/ICCAS50221.2020.9268336)
Citations: 2

Abstract

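According to the latest WHO data, published in 2017, the number of suicides in Kazakhstan was 4,855, or 3.55% of all deaths. The age-adjusted death rate is 27.74 per 100,000 population, which ranks Kazakhstan 4th in the world by this indicator. This article compares supervised and unsupervised machine learning algorithms (training "with a teacher" and "without a teacher") for identifying depressive content in social media posts, with semantic analysis focused on hopelessness and psychological pain as key causes of suicide. Suicide is not spontaneous: preparation can last about a year, during which a person shows signs of their condition, in our case by posting depressive content on their social network profile. The algorithm helps detect depressive content that may lead to suicide, so that people can get help from psychologists at the national center for suicide prevention in Kazakhstan. The highest result, an F1 score of 95%, was obtained by a random forest (supervised) with the tf-idf vectorization model. The K-means algorithm (unsupervised), also using tf-idf, shows impressive results that are only 4% lower in F1 and accuracy.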
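
The comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, and the toy posts, labels, hyperparameters, and the majority-vote mapping from clusters to labels are all hypothetical choices made for illustration; none of them come from the paper, and the scores it prints say nothing about the reported results.

```python
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical toy posts (1 = depressive, 0 = neutral); not data from the paper.
train_texts = [
    "i feel hopeless and empty every day",
    "nothing matters anymore i am so tired of the pain",
    "i cannot stop crying and everything feels hopeless",
    "the pain never ends and i feel alone",
    "great football match with friends today",
    "enjoyed a sunny walk in the park with my dog",
    "cooked a tasty dinner and watched a funny movie",
    "excited about the concert with friends this weekend",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["i feel so alone and hopeless", "fun day at the park with friends"]
test_labels = [1, 0]

# Shared tf-idf features, fitted on the training posts only.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Supervised ("with a teacher"): a random forest trained on the labels.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, train_labels)
forest_pred = forest.predict(X_test).tolist()

# Unsupervised ("without a teacher"): K-means sees no labels while clustering.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
train_clusters = kmeans.fit_predict(X_train).tolist()

# Assumption: name each cluster after the majority training label inside it,
# only so that the clusters can be scored with the same metrics.
cluster_to_label = {}
for cluster in set(train_clusters):
    members = [y for c, y in zip(train_clusters, train_labels) if c == cluster]
    cluster_to_label[cluster] = max(set(members), key=members.count)
kmeans_pred = [cluster_to_label[c] for c in kmeans.predict(X_test).tolist()]

# F1 and accuracy for both approaches on the held-out toy posts.
results = {}
for name, pred in [("random_forest", forest_pred), ("kmeans", kmeans_pred)]:
    results[name] = (
        f1_score(test_labels, pred, zero_division=0),
        accuracy_score(test_labels, pred),
    )
    print(f"{name}: f1={results[name][0]:.2f} accuracy={results[name][1]:.2f}")
```

Both models share one tf-idf vectorizer so that any difference in the scores comes from the learning algorithm rather than from the features. The labels are used for K-means only after clustering, to name the clusters for evaluation.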