Differing Content and Language Based on Poster-Patient Relationships on the Chinese Social Media Platform Weibo: Text Classification, Sentiment Analysis, and Topic Modeling of Posts on Breast Cancer.

IF 3.3 Q2 ONCOLOGY JMIR Cancer Pub Date : 2024-05-09 DOI:10.2196/51332
Zhouqing Zhang, Kongmeng Liew, Roeline Kuijer, Wan Jou She, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki
Citations: 0

Abstract

Background: Breast cancer affects the lives of not only those diagnosed but also the people around them. Many of those affected share their experiences on social media. However, these narratives may differ according to who the poster is and what their relationship with the patient is; a patient posting about their experiences may post different content from someone whose friends or family has breast cancer. Weibo is 1 of the most popular social media platforms in China, and breast cancer-related posts are frequently found there.

Objective: With the goal of understanding the different experiences of those affected by breast cancer in China, we aimed to explore how the content and language of relevant posts differ according to who the poster is and what their relationship with the patient is. Specifically, we examined whether emotional expression and topic content differ depending on whether the patient is the poster themselves or a friend, family member, relative, or acquaintance.

Methods: We used Weibo as a resource to examine how posts differ according to the different poster-patient relationships. We collected a total of 10,322 relevant Weibo posts. Using a 2-step analysis method, we fine-tuned 2 Chinese Robustly Optimized Bidirectional Encoder Representations from Transformers (BERT) Pretraining Approach (RoBERTa) models on this data set, which was annotated with poster-patient relationships. These models were chained in sequence to classify poster-patient relationships: first a binary classifier (no_patient or patient) and then a multiclass classifier (post_user, family_members, friends_relatives, acquaintances, heard_relation). Next, we used the Linguistic Inquiry and Word Count lexicon to conduct sentiment analysis across 5 emotion categories (positive and negative emotions, anger, sadness, and anxiety), followed by topic modeling (BERTopic).
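The 2-step classification and the lexicon-based emotion scoring described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two classifiers are hypothetical keyword rules standing in for the fine-tuned models, the tiny English lexicon is invented for the example (the real Linguistic Inquiry and Word Count lexicon is a licensed resource), and posts are assumed to be already segmented into tokens. The only details taken from the abstract are the class labels and the order of the two stages.

```python
from collections import Counter
from typing import Callable, Dict, List, Set

def classify_relationship(
    post: str,
    binary_clf: Callable[[str], str],
    multiclass_clf: Callable[[str], str],
) -> str:
    """Stage 1 separates no_patient from patient posts.
    Stage 2 runs only on posts that stage 1 labels as 'patient'."""
    if binary_clf(post) == "no_patient":
        return "no_patient"
    return multiclass_clf(post)

def emotion_rates(tokens: List[str], lexicon: Dict[str, Set[str]]) -> Dict[str, float]:
    """Percentage of tokens that fall into each emotion category."""
    counts: Counter = Counter()
    for token in tokens:
        for category, words in lexicon.items():
            if token in words:
                counts[category] += 1
    total = len(tokens)
    return {cat: (100.0 * counts[cat] / total if total else 0.0) for cat in lexicon}

# Hypothetical stand-ins for the two fine-tuned models (illustration only).
def toy_binary(post: str) -> str:
    return "patient" if "diagnosed" in post else "no_patient"

def toy_multiclass(post: str) -> str:
    return "family_members" if "mother" in post else "post_user"

# Hypothetical toy lexicon; the category names follow the abstract.
TOY_LEXICON: Dict[str, Set[str]] = {
    "anxiety": {"scared", "worried"},
    "sadness": {"sad", "cry"},
    "anger": {"hate", "furious"},
}

label = classify_relationship("my mother was diagnosed last week", toy_binary, toy_multiclass)
rates = emotion_rates(["mother", "scared", "chemotherapy", "today"], TOY_LEXICON)
```

Running the multiclass stage only on posts that pass the binary stage means the second model never has to represent the no_patient class, which is one plausible reason for chaining the two models rather than training a single 6-way classifier.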

Results: Our binary model (F1-score=0.92) and multiclass model (F1-score=0.83) were largely able to classify poster-patient relationships accurately. Subsequent sentiment analysis showed significant differences in emotion categories across all poster-patient relationships. Notably, negative emotions and anger were higher for the "no_patient" class, but sadness and anxiety were higher for the "family_members" class. Focusing on the top 30 topics, we also noted that topics on fears and anger toward cancer were higher in the "no_patient" class, but topics on cancer treatment were higher in the "family_members" class.
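For readers unfamiliar with the F1-score reported above, the sketch below computes it per class as the harmonic mean of precision and recall, and then averages across classes. The abstract does not state which averaging scheme was used, so the macro average here is an assumption, and the labels in the usage example are made-up data rather than results from the study.

```python
from typing import Dict, List

def f1_per_class(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """F1 for each label: 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores: Dict[str, float] = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0

# Made-up gold labels and predictions for the binary stage.
gold = ["patient", "patient", "no_patient", "no_patient"]
pred = ["patient", "no_patient", "no_patient", "no_patient"]
per_class = f1_per_class(gold, pred)
overall = macro_f1(gold, pred)
```

In this toy case one patient post is misclassified as no_patient, which lowers recall for the patient class and precision for the no_patient class, so both per-class scores fall below 1.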

Conclusions: Chinese users post different types of content, depending on the poster-patient relationship. If the patient is family, posts are sadder and more anxious but also contain more content on treatments. However, if no patient is detected, posts show higher levels of anger. We think that this anger may stem from posters' rants, which may help with emotion regulation and gathering social support.

Source journal: JMIR Cancer (Oncology)
CiteScore: 4.10
Self-citation rate: 0.00%
Articles published: 64
Review time: 12 weeks