Predicting Suicide Risk from Online Postings in Reddit: The UGent-IDLab Submission to the CLPsych 2019 Shared Task A

Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology
Pub Date: 2019-06-01
DOI: 10.18653/v1/W19-3019

Semere Kiros Bitew, Giannis Bekoulis, Johannes Deleu, Lucas Sterckx, Klim Zaporojets, Thomas Demeester, Chris Develder

Citations: 6

Abstract

This paper describes IDLab’s text classification systems submitted to Task A of the CLPsych 2019 shared task. The aim of the shared task was to develop automated systems that predict a person's degree of suicide risk from their posts on Reddit. Bag-of-words features, emotion features and post-level predictions are used to derive user-level predictions. Linear models and ensembles of these models are used to predict the final scores. We find that predicting fine-grained risk levels is much more difficult than flagging potentially at-risk users. Furthermore, given the available training data and the nature of the prediction task, we do not find clear added value in building richer ensembles compared to simple baselines.
