Twitter-STMHD: An Extensive User-Level Database of Multiple Mental Health Disorders

Suhavi, A. Singh, Udit Arora, Somyadeep Shrivastava, Aryaveer Singh, R. Shah, P. Kumaraguru
Published in: International Conference on Web and Social Media, 2022-05-31
DOI: 10.1609/icwsm.v16i1.19368 (https://doi.org/10.1609/icwsm.v16i1.19368)

Abstract

Social media can track and quantify user behavior, which makes it an appropriate resource for mental health studies. However, previous efforts in the area have been limited by a lack of data and of contextually relevant information. There is a need for large-scale, well-labeled mental health datasets, built with fast, reproducible methods that allow them to grow heuristically. In this paper, we address this need by building the Twitter - Self-Reported Temporally-Contextual Mental Health Diagnosis Dataset (Twitter-STMHD), a large-scale, user-level dataset grouped into 8 disorder categories and a companion class of control users. The dataset is 60% hand-annotated, which led to the creation of high-precision patterns for self-reported diagnoses; these patterns were used to construct the rest of the dataset. Rather than a corpus of tweets, the dataset is a collection of user profiles of those suffering from mental health disorders, providing a holistic view of the problem. By leveraging temporal information, the data for a given profile has been collected for the disease prevalence periods (onset of disorder, diagnosis, and progression), along with a fourth period: COVID-19. This is the largest dataset, and the only one, that captures the tweeting activity of users suffering from mental health disorders during the COVID-19 period.
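
The abstract mentions high-precision patterns for self-reported diagnoses but does not give them. As a rough illustration only, the idea of pattern-based matching might be sketched as below. The regular expression, the term list, and the function name are all hypothetical; they are not the authors' patterns, and the abstract does not name the eight disorder categories.

```python
import re

# Hypothetical disorder terms, for illustration only. The abstract does not
# list the dataset's eight categories, so these are NOT taken from the paper.
EXAMPLE_TERMS = ["depression", "anxiety"]

# Hypothetical pattern: a first-person statement of having been diagnosed.
# Requiring the pronoun "I" is one simple way to favor precision over recall.
DIAGNOSIS_PATTERN = re.compile(
    r"\bI\s+(?:was|am|have\s+been|got)\s+(?:recently\s+|just\s+)?"
    r"diagnosed\s+with\s+(?P<term>"
    + "|".join(re.escape(term) for term in EXAMPLE_TERMS)
    + r")\b",
    re.IGNORECASE,
)


def find_self_reported_diagnosis(tweet_text):
    """Return the matched disorder term in lowercase, or None if no match."""
    match = DIAGNOSIS_PATTERN.search(tweet_text)
    return match.group("term").lower() if match else None


if __name__ == "__main__":
    print(find_self_reported_diagnosis("I was diagnosed with depression last year"))
    print(find_self_reported_diagnosis("My friend was diagnosed with anxiety"))
```

A third-person report such as the second example is not matched, which reflects the "self-reported" restriction. A real pipeline would need many more patterns and manual validation of the kind the abstract describes.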