Proceedings of the International AAAI Conference on Web and Social Media: Latest Articles

SciLander: Mapping the Scientific News Landscape
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22144
Maurício Gruppi, Panayiotis Smeros, Sibel Adalı, Carlos Castillo, Karl Aberer
The COVID-19 pandemic has fueled the spread of misinformation on social media and the Web as a whole. The phenomenon dubbed the "infodemic" has taken the challenges of information veracity and trust to new heights by massively introducing seemingly scientific and technical elements into misleading content. Despite the existing body of work on modeling and predicting misinformation, the coverage of very complex scientific topics with inherent uncertainty and an evolving set of findings, such as COVID-19, poses many new challenges that are not easily solved by existing tools. To address these issues, we introduce SciLander, a method for learning representations of news sources reporting on science-based topics. We extract four heterogeneous indicators for the sources: two generic indicators that capture (1) the copying of news stories between sources and (2) the use of the same terms to mean different things (semantic shift), and two scientific indicators that capture (1) the usage of jargon and (2) the stance towards specific citations. We use these indicators as signals of source agreement, sampling pairs of positive (similar) and negative (dissimilar) samples, and combine them in a unified framework to train unsupervised news source embeddings with a triplet margin loss objective. We evaluate our method on a novel COVID-19 dataset containing nearly 1M news articles from 500 sources spanning a period of 18 months since the beginning of the pandemic in 2020. Our results show that the features learned by our model outperform state-of-the-art baseline methods on the task of news veracity classification. Furthermore, a clustering analysis suggests that the learned representations encode information about the reliability, political leaning, and partisanship bias of these sources.
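The triplet margin loss objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the source names, the 2-D embeddings, the Euclidean distance, and the margin of 1.0 are all assumptions chosen to keep the arithmetic easy to follow.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Mean of max(0, d(a, p) - d(a, n) + margin) over a batch of triplets."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)  # anchor-to-similar distance
    d_neg = np.linalg.norm(anchor - negative, axis=1)  # anchor-to-dissimilar distance
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

# Hypothetical 2-D embeddings for three news sources (illustrative values only).
embeddings = {
    "source_a": np.array([0.0, 0.0]),
    "source_b": np.array([0.0, 1.0]),  # assumed to agree with source_a on an indicator
    "source_c": np.array([3.0, 4.0]),  # assumed to disagree with source_a
}

anchor = embeddings["source_a"][None, :]
positive = embeddings["source_b"][None, :]
negative = embeddings["source_c"][None, :]

# Well-separated triplet: d_pos = 1, d_neg = 5, so the hinge is inactive.
loss_ok = triplet_margin_loss(anchor, positive, negative)
# Swapped roles: d_pos = 5, d_neg = 1, so the loss is 5 - 1 + 1 = 5.
loss_bad = triplet_margin_loss(anchor, negative, positive)
print(loss_ok, loss_bad)
```

In a training loop, the loss would be minimized with respect to the embeddings, pulling sources sampled as similar together and pushing sources sampled as dissimilar at least `margin` further away.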
Catch Me If You Can: Deceiving Stance Detection and Geotagging Models to Protect Privacy of Individuals on Twitter
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22136
Dilara Dogan, Bahadir Altun, Muhammed Said Zengin, Mucahid Kutlu, Tamer Elsayed
The recent advances in natural language processing have yielded many exciting developments in text analysis and language understanding models; however, these models can also be used to track people, bringing severe privacy concerns. In this work, we investigate what individuals can do to avoid being detected by those models while using social media platforms. We ground our investigation in two exposure-risky tasks, stance detection and geotagging. We explore a variety of simple techniques for modifying text, such as inserting typos in salient words, paraphrasing, and adding dummy social media posts. Our experiments show that the performance of BERT-based models fine-tuned for stance detection decreases significantly due to typos, but it is not affected by paraphrasing. Moreover, we find that typos have minimal impact on state-of-the-art geotagging models due to their increased reliance on social networks; however, we show that users can deceive those models by interacting with different users, reducing their performance by almost 50%.
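One of the text modifications named in the abstract, inserting typos in salient words, might look roughly like the sketch below. The abstract does not say how salient words are chosen or which kind of typo is used, so the hand-picked word list and the adjacent-character swap are assumptions made for illustration.

```python
import random

def swap_adjacent(word, rng):
    """Swap one pair of adjacent inner characters, keeping first and last letters."""
    if len(word) < 4:
        return word  # too short to perturb without touching the first or last letter
    i = rng.randrange(1, len(word) - 2)  # i and i + 1 are both inner positions
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def perturb(text, salient_words, seed=0):
    """Apply a typo to every whitespace-separated token found in salient_words."""
    rng = random.Random(seed)
    targets = {w.lower() for w in salient_words}
    tokens = [
        swap_adjacent(tok, rng) if tok.lower() in targets else tok
        for tok in text.split()
    ]
    return " ".join(tokens)

original = "the lockdown rules on masks changed"
# Hypothetical salient words; a real attack would need some saliency estimate.
modified = perturb(original, salient_words=["lockdown", "masks"])
print(modified)
```

The swap keeps the text readable for humans while changing the tokens a model sees. A word whose swapped characters happen to be identical would come out unchanged, which a fuller version would need to handle.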
On the Relation between Opinion Change and Information Consumption on Reddit
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22181
Flavio Petruzzellis, Francesco Bonchi, Gianmarco De Francisci Morales, Corrado Monti
While much attention has been devoted to the causes of opinion change, little is known about its consequences. Our study takes a first step in this direction by looking at Reddit, and in particular at the subreddit r/ChangeMyView, a community dedicated to debating one’s own opinions on a wide array of topics. We analyze changes in online information consumption behavior that arise after a self-reported opinion change, by looking at the participation in a set of sociopolitical communities. We find that people who self-report an opinion change are significantly more likely to change their future participation in a specific subset of those communities. Specifically, there is a significant association (Pearson r = 0.46) between using propaganda-like language in a community and the increase in chances of leaving it. Comparable results (Pearson r = 0.39) hold for the opposite direction, i.e., joining these same communities. In addition, the textual content of the post associated with opinion change is indicative of which communities will be joined or left: a predictive model based only on the text of this post can pinpoint these communities with an average precision@5 of 0.20. Our results establish a link between opinion change and information consumption, and highlight how online propagandistic communities act as a first gateway to internalize a shift in one’s sociopolitical opinion.
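The two statistics reported in the abstract, Pearson r and precision@5, can be computed as in the sketch below. The community names and numeric values are hypothetical placeholders, not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked items that are in the relevant set."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

# Hypothetical per-community scores: a perfectly linear relation gives r = 1.
propaganda_score = [1.0, 2.0, 3.0, 4.0]
leave_increase = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(propaganda_score, leave_increase)

# Hypothetical ranking produced by a model for one user.
ranked_communities = ["community_1", "community_2", "community_3",
                      "community_4", "community_5", "community_6"]
actually_joined = {"community_3"}
p5 = precision_at_k(ranked_communities, actually_joined, k=5)
print(r, p5)
```

With one correct community among the top five, precision@5 is 0.2, the same value as the reported average, which gives a feel for what that figure means for a single user.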
Codes, Patterns and Shapes of Contemporary Online Antisemitism and Conspiracy Narratives – an Annotation Guide and Labeled German-Language Dataset in the Context of COVID-19
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22216
Elisabeth Steffen, Helena Mihaljevic, Milena Pustet, Nyco Bischoff, Maria Do Mar Castro Varela, Yener Bayramoglu, Bahar Oghalai
Over the course of the COVID-19 pandemic, existing conspiracy theories were refreshed and new ones were created, often interwoven with antisemitic narratives, stereotypes and codes. The sheer volume of antisemitic and conspiracy theory content on the Internet makes data-driven algorithmic approaches essential for anti-discrimination organizations and researchers alike. However, the manifestation and dissemination of these two interrelated phenomena are still quite under-researched in scholarly empirical research on large text corpora. Algorithmic approaches for the detection and classification of specific contents usually require labeled datasets, annotated based on conceptually sound guidelines. While there is a growing number of datasets for the more general phenomenon of hate speech, the development of corpora and annotation guidelines for antisemitic and conspiracy content is still in its infancy, especially for languages other than English. To address this gap, we have developed an annotation guide for antisemitic and conspiracy theory online content in the context of the COVID-19 pandemic that includes working definitions, e.g. of specific forms of antisemitism such as encoded and post-Holocaust antisemitism. We use the guide to annotate a German-language dataset consisting of approximately 3,700 Telegram messages sent between 03/2020 and 12/2021.
Wiki-Based Communities of Interest: Demographics and Outliers
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22206
Hiba Arnaout, Simon Razniewski, Jeff Z. Pan
In this paper, we release data about demographic information and outliers of communities of interest. Identified from Wiki-based sources, mainly Wikidata, the data covers 7.5k communities, e.g., members of the White House Coronavirus Task Force, and 345k subjects, e.g., Deborah Birx. We describe the statistical inference methodology adopted to mine such data. We release subject-centric and group-centric datasets in JSON format, as well as a browsing interface. Finally, we foresee three areas where this dataset can be useful: in social sciences research, it provides a resource for demographic analyses; in web-scale collaborative encyclopedias, it serves as an edit recommender to fill knowledge gaps; and in web search, it offers lists of salient statements about queried subjects for higher user engagement. The dataset can be accessed at: https://doi.org/10.5281/zenodo.7410436
A Multi-Task Model for Sentiment Aided Stance Detection of Climate Change Tweets
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22194
Apoorva Upadhyaya, Marco Fisichella, Wolfgang Nejdl
Climate change has become one of the biggest challenges of our time. Social media platforms such as Twitter play an important role in raising public awareness and spreading knowledge about the dangers of the current climate crisis. With the increasing number of campaigns and communication about climate change through social media, the information could create more awareness and reach the general public and policy makers. However, these Twitter communications lead to polarization of beliefs, opinion-dominated ideologies, and often a split into two communities of climate change deniers and believers. In this paper, we propose a framework that helps identify denier statements on Twitter and thus classifies the stance of the tweet into one of the two attitudes towards climate change (denier/believer). The sentiment-related aspects of Twitter data on climate change are deeply rooted in general public attitudes toward climate change. Therefore, our work focuses on learning two closely related tasks: Stance Detection and Sentiment Analysis of climate change tweets. We propose a multi-task framework that performs stance detection (primary task) and sentiment analysis (auxiliary task) simultaneously. The proposed model incorporates the feature-specific and shared-specific attention frameworks to fuse multiple features and learn the generalized features for both tasks. The experimental results show that the proposed framework improves performance on the primary task (stance detection) by benefiting from the auxiliary task (sentiment analysis), compared to its uni-modal and single-task variants.
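The core idea of training a primary and an auxiliary task together can be sketched as a weighted sum of two classification losses over a shared representation. This deliberately omits the attention components described in the abstract. The random features, the three-way sentiment label set, and the auxiliary weight of 0.5 are assumptions, not details from the paper.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def multitask_loss(shared, w_stance, w_sentiment, stance_y, sentiment_y,
                   aux_weight=0.5):
    """Primary stance loss plus a down-weighted auxiliary sentiment loss."""
    stance_loss = softmax_cross_entropy(shared @ w_stance, stance_y)
    sentiment_loss = softmax_cross_entropy(shared @ w_sentiment, sentiment_y)
    return stance_loss + aux_weight * sentiment_loss, stance_loss, sentiment_loss

rng = np.random.default_rng(0)
shared = rng.normal(size=(4, 8))       # stand-in for a shared encoder's output on 4 tweets
w_stance = rng.normal(size=(8, 2))     # stance head: denier / believer
w_sentiment = rng.normal(size=(8, 3))  # sentiment head: assumed 3-way label set
stance_y = np.array([0, 1, 1, 0])
sentiment_y = np.array([0, 2, 1, 0])

total, stance_loss, sentiment_loss = multitask_loss(
    shared, w_stance, w_sentiment, stance_y, sentiment_y
)
print(total, stance_loss, sentiment_loss)
```

Because both heads read the same `shared` features, gradients from the sentiment loss would also shape the representation used for stance detection, which is the mechanism by which an auxiliary task can help the primary one.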
VaxxHesitancy: A Dataset for Studying Hesitancy towards COVID-19 Vaccination on Twitter
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22213
Yida Mu, Mali Jin, Charlie Grimshaw, Carolina Scarton, Kalina Bontcheva, Xingyi Song
Vaccine hesitancy has been a common concern, probably since vaccines were created, and with the popularisation of social media, people started to express their concerns about vaccines online alongside those posting pro- and anti-vaccine content. Predictably, since the first mentions of a COVID-19 vaccine, social media users posted about their fears and concerns or about their support and belief in the effectiveness of these rapidly developing vaccines. Identifying and understanding the reasons behind public hesitancy towards COVID-19 vaccines is important for policy makers that need to develop actions to better inform the population with the aim of increasing vaccine take-up. In the case of COVID-19, where the fast development of the vaccines was mirrored closely by growth in anti-vaxx disinformation, automatic means of detecting citizen attitudes towards vaccination became necessary. This is an important computational social sciences task that requires data analysis in order to gain in-depth understanding of the phenomena at hand. Annotated data is also necessary for training data-driven models for more nuanced analysis of attitudes towards vaccination. To this end, we created a new collection of over 3,101 tweets annotated with users' attitudes towards COVID-19 vaccination (stance). We also develop a domain-specific language model (VaxxBERT) that achieves the best predictive performance (73.0 accuracy and 69.3 F1-score) as compared to a robust set of baselines. To the best of our knowledge, these are the first dataset and model that model vaccine hesitancy as a category distinct from pro- and anti-vaccine stance.
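For readers unfamiliar with the two reported metrics, the sketch below computes accuracy and a macro-averaged F1 score on a toy three-class stance problem. The abstract does not state how F1 is averaged, so macro averaging is an assumption, and the labels and predictions are made-up illustrative values.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical label set in which hesitancy is its own class.
labels = ["pro", "anti", "hesitant"]
y_true = ["pro", "anti", "hesitant", "hesitant", "pro", "anti"]
y_pred = ["pro", "anti", "hesitant", "pro", "pro", "hesitant"]

acc = accuracy(y_true, y_pred)        # 4 of 6 predictions are correct
f1 = macro_f1(y_true, y_pred, labels)  # per-class F1: pro 0.8, anti 2/3, hesitant 0.5
print(acc, f1)
```

Macro averaging gives each class equal weight regardless of its size, which matters when one stance is much rarer than the others.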
Analyzing the Engagement of Social Relationships during Life Event Shocks in Social Media
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22134
Minje Choi, David Jurgens, Daniel M. Romero
Individuals experiencing unexpected distressing events (shocks) often rely on their social network for support. While prior work has shown how social networks respond to shocks, these studies usually treat all ties equally, despite differences in the support provided by different social relationships. Here, we conduct a computational analysis on Twitter that examines how responses to online shocks differ by the relationship type of a user dyad. We introduce a new dataset of over 13K instances of individuals' self-reporting shock events on Twitter and construct networks of relationship-labeled dyadic interactions around these events. By examining behaviors across 110K replies to shocked users in a pseudo-causal analysis, we demonstrate relationship-specific patterns in response levels and topic shifts. We also show that while well-established social dimensions of closeness such as tie strength and structural embeddedness contribute to shock responsiveness, the degree of impact is highly dependent on relationship and shock types. Our findings indicate that social relationships contain highly distinctive characteristics in network interactions, and that relationship-specific behaviors in online shock responses are distinct from those in offline settings.
Citations: 2
Partisan US News Media Representations of Syrian Refugees
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22130
Keyu Chen, Marzieh Babaeianjelodar, Yiwen Shi, Kamila Janmohamed, Rupak Sarkar, Ingmar Weber, Thomas Davidson, Munmun De Choudhury, Jonathan Huang, Shweta Yadav, Ashiqur KhudaBukhsh, Chris T Bauch, Preslav Nakov, Orestis Papakyriakopoulos, Koustuv Saha, Kaveh Khoshnood, Navin Kumar
We investigate how representations of Syrian refugees (2011-2021) differ across US partisan news outlets. We analyze 47,388 articles from the online US media about Syrian refugees to detail differences in reporting between left- and right-leaning media. We use various NLP techniques to understand these differences. Our polarization and question answering results indicated that left-leaning media tended to represent refugees as child victims, welcome in the US, and right-leaning media cast refugees as Islamic terrorists. We noted similar results with our sentiment and offensive speech scores over time, which detail possibly unfavorable representations of refugees in right-leaning media. A strength of our work is how the different techniques we have applied validate each other. Based on our results, we provide several recommendations. Stakeholders may utilize our findings to intervene around refugee representations, and design communications campaigns that improve the way society sees refugees and possibly aid refugee outcomes.
Citations: 0
We Are in This Together: Quantifying Community Subjective Wellbeing and Resilience
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22137
MeiXing Dong, Ruixuan Sun, Laura Biester, Rada Mihalcea
The COVID-19 pandemic disrupted everyone's life across the world. In this work, we characterize the subjective wellbeing patterns of 112 cities across the United States during the pandemic prior to vaccine availability, as exhibited in subreddits corresponding to the cities. We quantify subjective wellbeing using positive and negative affect. We then measure the pandemic's impact by comparing a community's observed wellbeing with its expected wellbeing, as forecasted by time series models derived from prior to the pandemic. We show that general community traits reflected in language can be predictive of community resilience. We predict how the pandemic would impact the wellbeing of each community based on linguistic and interaction features from normal times before the pandemic. We find that communities with interaction characteristics corresponding to more closely connected users and higher engagement were less likely to be significantly impacted. Notably, we find that communities that talked more about social ties normally experienced in-person, such as friends, family, and affiliations, were actually more likely to be impacted. Additionally, we use the same features to also predict how quickly each community would recover after the initial onset of the pandemic. We similarly find that communities that talked more about family, affiliations, and identifying as part of a group had a slower recovery.
Citations: 0