Proceedings of the ... International AAAI Conference on Weblogs and Social Media. International AAAI Conference on Weblogs and Social Media: Latest Articles

Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text.

Pub Date : 2024-05-31 Epub Date: 2024-05-28 DOI: 10.1609/icwsm.v18i1.31324

Muskan Garg, Msvpj Sathvik, Shaina Raza, Amrit Chadha, Sunghwan Sohn

The social NLP research community has witnessed a recent surge in computational advances in mental health analysis, aimed at building responsible AI models for the complex interplay between language use and self-perception. Such responsible AI models aid in quantifying psychological concepts from user-penned texts on social media. Thinking beyond the low-level (classification) task, we advance the existing binary classification dataset towards a higher-level task of reliability analysis through the lens of explanations, posing it as one of the safety measures. We annotate the LoST dataset to capture nuanced textual cues that suggest the presence of low self-esteem in the posts of Reddit users. We further state that the NLP models developed for determining the presence of low self-esteem focus more on three types of textual cues: (i) Trigger: words that trigger mental disturbance, (ii) LoST indicators: text indicators emphasizing low self-esteem, and (iii) Consequences: words describing the consequences of mental disturbance. We implement existing classifiers to examine the attention mechanism in pre-trained language models (PLMs) for a domain-specific psychology-grounded task. Our findings suggest the need to shift the focus of PLMs from Trigger and Consequences to a more comprehensive explanation, emphasizing LoST indicators while determining low self-esteem in Reddit posts.

Volume 18, pp. 422-434. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11881108/pdf/

Citations: 0

Negative Associations in Word Embeddings Predict Anti-black Bias across Regions-but Only via Name Frequency.

Pub Date : 2022-05-31 DOI: 10.1609/icwsm.v16i1.19399

Austin van Loon, Salvatore Giorgi, Robb Willer, Johannes Eichstaedt

The word embedding association test (WEAT) is an important method for measuring linguistic biases against social groups such as ethnic minorities in large text corpora. It does so by comparing the semantic relatedness of words prototypical of the groups (e.g., names unique to those groups) and attribute words (e.g., 'pleasant' and 'unpleasant' words). We show that anti-Black WEAT estimates from geo-tagged social media data at the level of metropolitan statistical areas strongly correlate with several measures of racial animus, even when controlling for sociodemographic covariates. However, we also show that every one of these correlations is explained by a third variable: the frequency of Black names in the underlying corpora relative to White names. This occurs because word embeddings tend to group positive (negative) words and frequent (rare) words together in the estimated semantic space. As the frequency of Black names on social media is strongly correlated with Black Americans' prevalence in the population, this results in spuriously high anti-Black WEAT estimates wherever few Black Americans live. This suggests that research using the WEAT to measure bias should consider term frequency, and also demonstrates the potential consequences of using black-box models like word embeddings to study human cognition and behavior.
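
The comparison the abstract describes can be sketched as the standard WEAT effect size: for each target word, take its mean cosine similarity to one attribute set minus its mean similarity to the other, then standardize the difference between the two target groups. This is a minimal sketch, not the authors' pipeline. The two-dimensional vectors and the names `names_x`, `names_y`, `pleasant`, and `unpleasant` are made-up placeholders, and the use of the sample standard deviation is an assumption (implementations differ on this).

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """Mean similarity of w to attribute set A minus mean similarity to set B."""
    return mean(cosine(w, a) for a in attrs_a) - mean(cosine(w, b) for b in attrs_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Standardized difference in association between target sets X and Y."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    # Assumption: sample standard deviation over all target words.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy embeddings; real embeddings would have hundreds of dimensions.
names_x = [(1.0, 0.1), (0.9, 0.2)]
names_y = [(0.1, 1.0), (0.2, 0.9)]
pleasant = [(1.0, 0.0)]
unpleasant = [(0.0, 1.0)]

d = weat_effect_size(names_x, names_y, pleasant, unpleasant)
print(f"WEAT effect size: {d:.3f}")
```

The score depends only on where the name vectors sit in the embedding space, so anything that moves those vectors, such as the term frequency the abstract points to, changes the estimate.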

Volume 16, pp. 1419-1424. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10147343/pdf/nihms-1842382.pdf

Citations: 3

Correcting Sociodemographic Selection Biases for Population Prediction from Social Media.

Pub Date : 2022-05-31

Salvatore Giorgi, Veronica E Lynn, Keshav Gupta, Farhan Ahmed, Sandra Matz, Lyle H Ungar, H Andrew Schwartz

Social media is increasingly used for large-scale population predictions, such as estimating community health statistics. However, social media users are not typically a representative sample of the intended population - a "selection bias". Within the social sciences, such a bias is typically addressed with restratification techniques, where observations are reweighted according to how under- or over-sampled their socio-demographic groups are. Yet, restratification is rarely evaluated for improving prediction. In this two-part study, we first evaluate standard, "out-of-the-box" restratification techniques, finding they provide no improvement and often even degrade prediction accuracies across four tasks of estimating U.S. county population health statistics from Twitter. The core reasons for degraded performance seem to be tied to their reliance on either sparse or shrunken estimates of each population's socio-demographics. In the second part of our study, we develop and evaluate Robust Poststratification, which consists of three methods to address these problems: (1) estimator redistribution to account for shrinking, as well as (2) adaptive binning and (3) informed smoothing to handle sparse socio-demographic estimates. We show that each of these methods leads to significant improvement in prediction accuracies over the standard restratification approaches. Taken together, Robust Poststratification enables state-of-the-art prediction accuracies, yielding a 53.0% increase in variance explained (R²) in the case of surveyed life satisfaction, and a 17.8% average increase across all tasks.
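
The reweighting idea behind standard restratification can be sketched as naive poststratification: each observation gets a weight equal to its group's share of the target population divided by that group's share of the sample. This is a sketch of the out-of-the-box baseline only, not the Robust Poststratification method the abstract introduces. The group labels, population shares, and scores below are invented for illustration.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Weight each observation by population share / sample share of its group."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical county sample: younger users are over-represented (3 of 4).
groups = ["young", "young", "young", "old"]
scores = [1.0, 1.0, 1.0, 3.0]
population = {"young": 0.5, "old": 0.5}  # assumed census shares

weights = poststratification_weights(groups, population)
naive = sum(scores) / len(scores)
corrected = weighted_mean(scores, weights)
print(naive, corrected)
```

A group that is missing from the sample has no observations to upweight, and a group with only one or two users receives a very large weight. That is one way the sparse socio-demographic estimates mentioned in the abstract can hurt this baseline.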

Volume 16, no. 1, pp. 228-240. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9714525/pdf/nihms-1842768.pdf

Citations: 0

Classifying Minority Stress Disclosure on Social Media with Bidirectional Long Short-Term Memory.

Pub Date : 2022-05-31

Cory J Cascalheira, Shah Muhammad Hamdi, Jillian R Scheer, Koustuv Saha, Soukaina Filali Boubrahimi, Munmun De Choudhury

Because of their stigmatized social status, sexual and gender minority (SGM; e.g., gay, transgender) people experience minority stress (i.e., identity-based stress arising from adverse social conditions). Given that minority stress is the leading framework for understanding health inequity among SGM people, researchers and clinicians need accurate methods to detect minority stress. Since social media fulfills important developmental, affiliative, and coping functions for SGM people, social media may be an ecologically valid channel for detecting minority stress. In this paper, we propose a bidirectional long short-term memory (BI-LSTM) network for classifying minority stress disclosed on Reddit. Our experiments on a dataset of 12,645 Reddit posts resulted in an average accuracy of 65%.

Citations: 0

Classifying Minority Stress Disclosure on Social Media with Bidirectional Long Short-Term Memory

Pub Date : 2022-05-31 DOI: 10.1609/icwsm.v16i1.19390

C. Cascalheira, S. M. Hamdi, Jillian R. Scheer, Koustuv Saha, S. F. Boubrahimi, M. Choudhury

Because of their stigmatized social status, sexual and gender minority (SGM; e.g., gay, transgender) people experience minority stress (i.e., identity-based stress arising from adverse social conditions). Given that minority stress is the leading framework for understanding health inequity among SGM people, researchers and clinicians need accurate methods to detect minority stress. Since social media fulfills important developmental, affiliative, and coping functions for SGM people, social media may be an ecologically valid channel for detecting minority stress. In this paper, we propose a bidirectional long short-term memory (BI-LSTM) network for classifying minority stress disclosed on Reddit. Our experiments on a dataset of 12,645 Reddit posts resulted in an average accuracy of 65%.

Citations: 3

Tweet Classification to Assist Human Moderation for Suicide Prevention.

Pub Date : 2021-06-04 Epub Date: 2021-05-22

Ramit Sawhney, Harshit Joshi, Alicia Nobles, Rajiv Ratn Shah

Social media platforms already leverage existing online socio-technical systems to deliver just-in-time suicide prevention interventions to the public. These efforts primarily rely on self-reports of potential self-harm content that is reviewed by moderators. Most recently, platforms have employed automated models to identify self-harm content, but acknowledge that these automated models still struggle to understand the nuance of human language (e.g., sarcasm). By explicitly focusing on Twitter posts that could easily be misidentified by a model as expressing suicidal intent (i.e., they contain similar phrases such as "wanting to die"), our work examines the temporal differences in historical expressions of general and emotional language prior to a clear expression of suicidal intent. Additionally, we analyze time-aware neural models that build on these language variants and factor in the historical, emotional spectrum of a user's tweeting activity. The strongest model achieves high (statistically significant) performance (macro F1=0.804, recall=0.813) in identifying social media posts indicative of suicidal intent. Using three use cases of tweets with phrases common to suicidal intent, we qualitatively analyze and interpret how such models decided if suicidal intent was present and discuss how these analyses may be used to alleviate the burden on human moderators within the known constraints of how moderation is performed (e.g., no access to the user's timeline). Finally, we discuss the ethical implications of such data-driven models and inferences about suicidal intent from social media. Content warning: this article discusses self-harm and suicide.
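
For readers unfamiliar with the reported metric, macro F1 is the unweighted average of the per-class F1 scores, so a rare class counts as much as a common one. The sketch below computes it from scratch on made-up binary labels. The data are placeholders, and the abstract does not say how its recall figure was averaged, so the `recall` helper here (recall for a single chosen class) is only one possible reading.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def recall(y_true, y_pred, positive):
    """Share of true `positive` items that the model also labelled `positive`."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 1 = flagged as concerning, 0 = not flagged.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

score = macro_f1(y_true, y_pred)
flagged_recall = recall(y_true, y_pred, positive=1)
print(f"macro F1 = {score:.3f}, recall = {flagged_recall:.3f}")
```

In a moderation setting, recall on the flagged class indicates how many concerning posts reach a human reviewer, while macro F1 also penalizes false alarms on the majority class.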

pp. 609-620. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8843106/pdf/nihms-1774843.pdf

Citations: 0

Well-Being Depends on Social Comparison: Hierarchical Models of Twitter Language Suggest That Richer Neighbors Make You Less Happy.

Pub Date : 2021-01-01 DOI: 10.1609/icwsm.v15i1.18132

Salvatore Giorgi, Sharath Chandra Guntuku, Johannes C Eichstaedt, Claire Pajot, H Andrew Schwartz, Lyle H Ungar

Psychological research has shown that subjective well-being is sensitive to social comparison effects; individuals report decreased happiness when their neighbors earn more than they do. In this work, we use Twitter language to estimate the well-being of users, and model both individual and neighborhood income using hierarchical modeling across counties in the United States (US). We show that language-based estimates from a sample of 5.8 million Twitter users replicate results obtained from large-scale well-being surveys - relatively richer neighbors lead to lower well-being, even when controlling for absolute income. Furthermore, predicting individual-level happiness using hierarchical models (i.e., individuals within their communities) out-predicts standard baselines. We also explore language associated with relative income differences and find that individuals with lower income than their community tend to swear (f*ck, sh*t, b*tch), express anger (pissed, bullsh*t, wtf), hesitation (don't, anymore, idk, confused) and acts of social deviance (weed, blunt, drunk). These results suggest that social comparison robustly affects reported well-being, and that Twitter language analyses can be used to both measure these effects and shed light on their underlying psychological dynamics.

Volume 15, pp. 1069-1074. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10099468/pdf/nihms-1854629.pdf

Citations: 4

Examining Peer-to-Peer and Patient-Provider Interactions on a Social Media Community Facilitating Ask the Doctor Services.

Pub Date : 2020-06-01

Alicia L Nobles, Eric C Leas, Mark Dredze, John W Ayers

Ask the Doctor (AtD) services provide patients the opportunity to seek medical advice using online platforms. While these services represent a new mode of healthcare delivery, study of these online health communities and how they are used is limited. In particular, it is unknown if these platforms replicate existing barriers and biases in traditional healthcare delivery across demographic groups. We present an analysis of AskDocs, a subreddit that functions as a public AtD platform on social media. We examine the demographics of users, the health topics discussed, if biases present in offline healthcare settings exist on this platform, and how empathy is expressed in interactions between users and physicians. Our findings suggest a number of implications to enhance and support peer-to-peer and patient-provider interactions on online platforms.

Citations: 0

Examining Peer-to-Peer and Patient-Provider Interactions on a Social Media Community Facilitating Ask the Doctor Services

Pub Date : 2020-05-26 DOI: 10.1609/icwsm.v14i1.7315

A. Nobles, E. Leas, Mark Dredze, J. Ayers

Ask the Doctor (AtD) services provide patients the opportunity to seek medical advice using online platforms. While these services represent a new mode of healthcare delivery, study of these online health communities and how they are used is limited. In particular, it is unknown if these platforms replicate existing barriers and biases in traditional healthcare delivery across demographic groups. We present an analysis of AskDocs, a subreddit that functions as a public AtD platform on social media. We examine the demographics of users, the health topics discussed, if biases present in offline healthcare settings exist on this platform, and how empathy is expressed in interactions between users and physicians. Our findings suggest a number of implications to enhance and support peer-to-peer and patient-provider interactions on online platforms.

Citations: 10

Correcting Sociodemographic Selection Biases for Population Prediction from Social Media

Pub Date : 2019-11-10 DOI: 10.1609/icwsm.v16i1.19287

Salvatore Giorgi, Veronica E. Lynn, Keshav Gupta, F. Ahmed, S. Matz, Lyle Ungar, H. A. Schwartz

Social media is increasingly used for large-scale population predictions, such as estimating community health statistics. However, social media users are not typically a representative sample of the intended population - a "selection bias". Within the social sciences, such a bias is typically addressed with restratification techniques, where observations are reweighted according to how under- or over-sampled their socio-demographic groups are. Yet, restratification is rarely evaluated for improving prediction. In this two-part study, we first evaluate standard, "out-of-the-box" restratification techniques, finding they provide no improvement and often even degrade prediction accuracies across four tasks of estimating U.S. county population health statistics from Twitter. The core reasons for degraded performance seem to be tied to their reliance on either sparse or shrunken estimates of each population's socio-demographics. In the second part of our study, we develop and evaluate Robust Poststratification, which consists of three methods to address these problems: (1) estimator redistribution to account for shrinking, as well as (2) adaptive binning and (3) informed smoothing to handle sparse socio-demographic estimates. We show that each of these methods leads to significant improvement in prediction accuracies over the standard restratification approaches. Taken together, Robust Poststratification enables state-of-the-art prediction accuracies, yielding a 53.0% increase in variance explained (R²) in the case of surveyed life satisfaction, and a 17.8% average increase across all tasks.

pp. 228-240.

Citations: 11