#PrayForDad: Learning the Semantics Behind Why Social Media Users Disclose Health Information

Zhijun Yin, You Chen, D. Fabbri, Jimeng Sun, B. Malin
DOI: 10.1609/icwsm.v10i1.14735 (https://doi.org/10.1609/icwsm.v10i1.14735)
Venue: Proceedings of the ... International AAAI Conference on Weblogs and Social Media
Volume/issue (as listed): 46 1
Pages: 456-465
Published: 2016-03-31
Citations: 14

Abstract

User-generated content in social media is increasingly acknowledged as a rich resource for research into health problems. One particular area of interest is the semantics individuals evoke, because these can influence when health-related information is disclosed. While there have been multiple investigations into why self-disclosure occurs, much less is known about when individuals choose to disclose information about other people (e.g., a relative), which is a significant privacy concern. In this paper, we introduce a novel framework to investigate how semantics influence disclosure routines for 34 health issues. This framework begins with a supervised classification model to distinguish tweets that communicate personal health issues from confounding concepts (e.g., metaphorical statements that include a health-related keyword). Next, we annotate tweets for each health issue with linguistic and psychological categories (e.g., social processes, affective processes, and personal concerns). Then, we apply non-negative matrix factorization over a health issue-by-language category space. Finally, the factorized basis space is leveraged to group health issues into natural aggregations based on how they are discussed. We evaluate this framework with four months of tweets (over 200 million) and show that certain semantics correspond to whom a health mention pertains. Our findings show that health issues associated with family members, high medical cost, and social support (e.g., Alzheimer's disease, cancer, and Down syndrome) lead to tweets that are more likely to disclose another individual's health status, while tweets about more benign health issues (e.g., allergy, arthritis, and bronchitis) that involve biological processes (e.g., health and ingestion) and negative emotions are more likely to contain self-disclosures.
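
The factorization and grouping steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the four health issues, the four category names, and the scores in the toy matrix are hypothetical, and the Lee-Seung multiplicative update rule is one common way to compute a non-negative matrix factorization, not necessarily the solver the authors used. Each health issue is assigned to the basis component on which it loads most heavily.

```python
import numpy as np

def nmf(V, rank, iters=1000, seed=0, eps=1e-9):
    """Approximate non-negative V (issues x categories) as W @ H using
    Lee-Seung multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each row of H sums to 1; W absorbs the scale, so the
    # product W @ H is unchanged and W's columns become comparable.
    scale = H.sum(axis=1)
    return W * scale[None, :], H / scale[:, None]

# Hypothetical toy data: rows are health issues, columns are language
# categories; values stand in for how often each category appears.
issues = ["cancer", "alzheimers", "allergy", "bronchitis"]
categories = ["family", "social", "bio", "negemo"]
V = np.array([
    [0.9, 0.8, 0.1, 0.2],
    [0.8, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.9, 0.8],
    [0.2, 0.1, 0.8, 0.9],
])

W, H = nmf(V, rank=2)

# Group each issue by its dominant basis component.
groups = W.argmax(axis=1)
clusters = {}
for issue, g in zip(issues, groups):
    clusters.setdefault(int(g), []).append(issue)

for g, members in sorted(clusters.items()):
    top = categories[int(H[g].argmax())]
    print(f"component {g}: {members} (strongest category: {top})")
```

In this toy setting the two family-oriented issues fall into one group and the two biology-oriented issues into the other. Component numbering depends on the random initialization, so only the grouping itself is meaningful.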