Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22217
J. Pohl, Simon Markmann, Dennis Assenmacher, C. Grimme
Social media can be a mirror of human interaction, society, and historic disruptions. Their reach enables the global dissemination of information in the shortest possible time and, thus, the individual participation of people worldwide in global events in almost real time. However, these platforms can be used just as efficiently in information warfare to manipulate human perception and opinion formation. In this paper, we describe a dataset of raw tweets collected via the Twitter Streaming API in the context of the onset of the war that Russia started in Ukraine on February 24, 2022. A distinctive feature of the dataset is that it covers the period from one week before to one week after Russia's invasion of Ukraine. This paper details the acquisition process and provides first insights into the content of the data stream. In addition, the data has been annotated with availability tags resulting from rehydration attempts at two points in time: directly after data acquisition and shortly before manuscript submission. This may provide information on Twitter moderation policies. Further, we provide a detailed list of other published datasets covering the same topic. On the content level, we show that our dataset comprises several distinct topics related to the conflict and conspiracy narratives -- topics that deserve more profound investigation. Therefore, the presented dataset is also made available to the community in an extended version with pseudonymized tweet content upon request.
{"title":"Invasion@Ukraine: Providing and Describing a Twitter Streaming Dataset That Captures the Outbreak of War between Russia and Ukraine in 2022","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22217","journal":{"name":"International Conference on Web and Social Media"}}
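The availability annotation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each rehydration attempt yields the set of tweet IDs that could still be retrieved; the function name and the four tag labels are hypothetical and are not taken from the paper.

```python
def availability_tags(collected_ids, available_first, available_second):
    """Tag each collected tweet ID by the outcome of two rehydration checks.

    The tag labels below are illustrative assumptions, not the paper's labels.
    """
    tags = {}
    for tweet_id in collected_ids:
        first = tweet_id in available_first    # retrievable at the first check
        second = tweet_id in available_second  # retrievable at the second check
        if first and second:
            tags[tweet_id] = "available_both"
        elif first:
            tags[tweet_id] = "gone_by_second_check"
        elif second:
            tags[tweet_id] = "only_second_check"
        else:
            tags[tweet_id] = "unavailable_both"
    return tags


# Toy IDs for illustration only.
example = availability_tags(
    collected_ids=["t1", "t2", "t3", "t4"],
    available_first={"t1", "t2"},
    available_second={"t1", "t3"},
)
```

Comparing the two checks per tweet, rather than storing a single flag, is what would let a later analysis separate content that disappeared early from content that disappeared between the two checks.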
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22214
Brooke Perreault, Lan Dau, Anya Wintner, Eni Mustafaraj
How do Google Search results change following an impactful real-world event, such as the U.S. Supreme Court decision on June 24, 2022 to overturn Roe v. Wade? And what do they tell us about the nature of event-driven content generated by various participants in the online information environment? In this paper, we present a dataset of more than 1.74 million Google Search results pages collected between June 24 and July 17, 2022, intended to capture what Google Search surfaced in response to queries about this event of national importance. These search pages were collected for 65 locations in 13 U.S. states, a mix of red, blue, and purple states with respect to their voting patterns. We describe the process of building a set of circa 1,700 phrases used for searching Google, how we gathered the search results for each location, and how these results were parsed to extract information about the most frequently encountered web domains. We believe that this dataset, which comprises raw data (search results as HTML files) and processed data (extracted links organized as CSV files), can be used to answer research questions that are of interest to computational social scientists as well as communication and media studies scholars.
{"title":"Capturing the Aftermath of the Dobbs v. Jackson Women's Health Organization Decision in Google Search Results across the U.S.","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22214","journal":{"name":"International Conference on Web and Social Media"}}
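The step of finding the most frequently encountered web domains among the extracted links can be sketched as follows. This is only an illustration under stated assumptions: the links are taken to be already available as a list of URL strings, the helper names `domain_of` and `top_domains` are hypothetical, and the URLs use reserved example domains rather than real data from the dataset.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def top_domains(links, n=3):
    """Count how often each domain occurs and return the n most common."""
    counts = Counter(domain_of(u) for u in links if u)
    return counts.most_common(n)


# Placeholder links; a real run would read them from the CSV files.
links = [
    "https://www.example.org/a",
    "https://example.org/b",
    "https://news.example.com/c",
]
ranking = top_domains(links, n=2)
```

Stripping the `www.` prefix is a design choice made here so that two spellings of the same site are counted together; whether the authors normalized hosts this way is not stated in the abstract.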
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22177
Lillio Mok, M. Inzlicht, Ashton Anderson
Online social platforms afford users vast digital spaces to share and discuss current events. However, scholars have concerns both over their role in segregating information exchange into ideological echo chambers, and over evidence that these echo chambers are nonetheless overstated. In this work, we investigate news-sharing patterns across the entirety of Reddit and find that the platform appears polarized macroscopically, especially in politically right-leaning spaces. On closer examination, however, we observe that the majority of this effect originates from small, hyper-partisan segments of the platform accounting for a minority of news shared. We further map the temporal evolution of polarized news sharing and uncover evidence that, in addition to having grown drastically over time, polarization in hyper-partisan communities also began much earlier than 2016 and is resistant to Reddit's largest moderation event. Our results therefore suggest that polarized news sharing runs narrow but deep online. Rather than being guided by the general prevalence or absence of echo chambers, we argue that platform policies are better served by measuring and targeting the communities in which ideological segregation is strongest.
{"title":"Echo Tunnels: Polarized News Sharing Online Runs Narrow but Deep","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22177","journal":{"name":"International Conference on Web and Social Media"}}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22158
Hitkul Jangra, Rajiv Shah, P. Kumaraguru
Deaths due to drug overdose in the US have doubled in the last decade. Drug-related content on social media has also exploded in the same time frame. The pseudo-anonymous nature of social media platforms enables users to discuss taboo and sometimes illegal topics like drug consumption. User-generated content (UGC) about drugs on social media can be used as an online proxy to detect offline drug consumption. UGC is also exposed to the praise and criticism of the community. The law of effect proposes that positive reinforcement of an experience can incentivize users to engage in the experience repeatedly. Therefore, we hypothesize that positive community feedback on a user's online drug consumption disclosure will increase the probability of the user posting another such disclosure. To this end, we collect data from 10 drug-related subreddits. First, we build a deep learning model to classify UGC as indicative of offline drug consumption or not, and analyze the extent of such activities. Further, we use matching-based causal inference techniques to unravel the effect of community feedback on users' future drug consumption behavior. We discover that 84% of posts and 55% of comments on drug-related subreddits indicate real-life drug consumption. Users who get positive feedback generate up to two times more drug consumption content in the future. Finally, we conducted an anonymous user study on drug-related subreddits to compare members' opinions with our experimental findings and show that users tend to underestimate the effect community peers can have on their decision to interact with drugs.
{"title":"Effect of Feedback on Drug Consumption Disclosures on Social Media","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22158","journal":{"name":"International Conference on Web and Social Media"}}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22210
Joseph Gatto, Parker Seegmiller, Garrett M Johnston, Madhusudan Basak, Sarah Masud Preum
The task of extracting and classifying entities is at the core of important Health-NLP systems such as misinformation detection, medical dialogue modeling, and patient-centric information tools. Granular knowledge of textual entities allows these systems to utilize knowledge bases, retrieve relevant information, and build graphical representations of texts. Unfortunately, most existing health entity recognizers are trained on clinical notes, which are both lexically and semantically different from public health information found in online health resources or social media. In other words, existing health entity recognizers vastly under-represent the entities relevant to public health data, such as those provided by sites like WebMD. It is crucial that future Health-NLP systems be able to model such information, as people rely on online health advice for personal health management and clinically relevant decision making. In this work, we release a new annotated dataset, HealthE, which facilitates the large-scale analysis of online textual health advice. HealthE consists of 3,400 health advice statements with token-level entity annotations. Additionally, we release 2,256 health statements which are not health advice to facilitate health advice mining. HealthE is the first dataset with an entity-recognition label space designed for the modeling of online health advice. We motivate the need for HealthE by demonstrating the limitations of five widely used health entity recognizers, such as those offered by Google and Amazon, on HealthE. We additionally benchmark three pre-trained language models on our dataset as a reference for future research. All data is made publicly available.
{"title":"HealthE: Recognizing Health Advice & Entities in Online Health Communities","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22210","journal":{"name":"International Conference on Web and Social Media"}}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22224
A. Khatua, E. Zagheni, Ingmar Weber
Extant literature has explored the social integration process of migrants settling in host communities. However, this literature typically takes a migrant-centric view, implicitly putting the burden of a successful integration on the migrant, and trying to identify the factors that lead to integration along various dimensions. In this paper, we flip this point of view by studying the attributes of natives that govern their propensity to form social ties with migrants. We do so by using anonymous and aggregate social network data provided by Facebook's advertising platform. More specifically, we look at factors that influence the propensity for a likely-to-be non-Muslim Facebook user to have at least one social connection to a Facebook user who celebrates Ramadan. Given that, in the European context, following Islam is predominantly tied to a migration background, this gives us a lens into cross-cultural native-migrant connectivity. Our study considers demographic attributes of the host population, such as age, gender, and education level, as well as spatial variation across 30 European cities. Our findings suggest that young, educated, and male Facebook users are relatively more likely to build cross-cultural ties, compared to older, less educated, and female Facebook users. We also observe heterogeneity across the analyzed cities.
{"title":"Host-Centric Social Connectedness of Migrants in Europe on Facebook","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22224","journal":{"name":"International Conference on Web and Social Media"}}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22212
Shogo Matsuno, Sakae Mizuki, Takeshi Sakaki
In this study, we discuss issues in the traditional evaluation norms of trend forecasts, outline a suitable evaluation method, propose an evaluation dataset construction procedure, and publish Trend Dataset, the dataset we have created. As trend predictions often yield economic benefits, trend forecasting studies have been widely conducted. However, a consistent and systematic evaluation protocol has yet to be adopted. We consider that the desired evaluation method would address the performance of predicting which entity will trend, when a trend occurs, and how much it will trend, based on a reliable indicator of the general public's recognition as a gold standard. Accordingly, we propose a dataset construction method that includes annotations for trending status (trending or non-trending), degree of trending (how well it is recognized), and the trend period corresponding to a surge in recognition rate. The proposed method uses questionnaire-based recognition rates interpolated using Internet search volume, enabling trend period annotation on a weekly timescale. The main novelty is that we survey when the respondents recognize both the entities that are highly likely to have trended and those that are not. This procedure enables a balanced collection of both trending and non-trending entities. We constructed the dataset and verified its quality. We confirmed that interest in entities, estimated using Wikipedia information, enables the efficient collection of trending entities a priori. We also confirmed that the Internet search volume agrees with the public recognition rate among trending entities.
{"title":"Construction of Evaluation Datasets for Trend Forecasting Studies","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22212","journal":{"name":"International Conference on Web and Social Media"}}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22183
P. Morales, M. Berriche, Jean-Philippe Cointet
To understand why internet users spread fake news online, many studies have focused on individual drivers, such as cognitive skills, media literacy, or demographics. Recent findings have also shown the role of complex socio-political dynamics, highlighting that political polarization and ideologies are closely linked to a propensity to participate in the dissemination of fake news. Most of the existing empirical studies have focused on the US example by exploiting the self-reported or solicited positioning of users on a dichotomous scale opposing liberals with conservatives. Yet, left-right polarization alone is insufficient to study socio-political dynamics when considering non-binary and multi-dimensional party systems, in which relevant ideological stances must be characterized in additional dimensions, relating for example to opposition to elites, government, political parties, or mainstream media. In this article, we leverage ideological embeddings of Twitter networks in France in multi-dimensional opinion spaces, where dimensions stand for attitudes towards different issues, and we trace the positions of users who shared articles that were rated as misinformation by fact-checkers. In multi-dimensional settings, and in contrast with the US, opinion dimensions capturing attitudes towards elites are more predictive of whether a user shares misinformation. Most users sharing misinformation hold salient anti-elite sentiments and, among them, more so those with radical left- and right-leaning stances. Our results reinforce the importance of enriching one-dimensional left-right analyses, showing that other ideological dimensions, such as anti-elite sentiment, are critical when characterizing users who spread fake news. This lends support to emerging accounts of social drivers of misinformation through political polarization, but also stresses the role of the entanglement between fake news, anti-elite polarization, and the role of scientific authorities in public debate.
{"title":"The Geometry of Misinformation: Embedding Twitter Networks of Users Who Spread Fake News in Geometrical Opinion Spaces","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22183","journal":{"name":"International Conference on Web and Social Media"}}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22190
Lu Sun, F. M. Harper, Chia-Jung Lee, Vanessa Murdock, Bárbara Poblete
Online e-commerce product reviews can be highly influential in a customer's decision-making process. Reviews often describe personal experiences with a product and provide candid opinions about its pros and cons. In some cases, reviewers choose to share information about themselves, just as they might on social platforms. These descriptions are a valuable source of information about who finds a product most helpful. Customers benefit from key insights about a product from people who share their interests, and sellers might use the information to better serve their customers' needs. In this work, we present a comprehensive look into voluntary self-descriptive information found in public customer reviews. We analyzed what people share about themselves and how this contributes to their product opinions. We developed a taxonomy of types of self-descriptions and a machine-learned model that classifies reviews according to this taxonomy. We present new quantitative findings and a thematic study of the perceived purpose of self-descriptions in reviews.
{"title":"Characterizing and Identifying Socially Shared Self-Descriptions in Product Reviews","authors":"Lu Sun, F. M. Harper, Chia-Jung Lee, Vanessa Murdock, Bárbara Poblete","doi":"10.1609/icwsm.v17i1.22190","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22190","url":null,"abstract":"Online e-commerce product reviews can be highly influential in a customer's decision-making process. Reviews often describe personal experiences with a product and provide candid opinions about its pros and cons. In some cases, reviewers choose to share information about themselves, just as they might on social platforms. These descriptions are a valuable source of information about who finds a product most helpful. Customers benefit from key insights about a product from people who share their interests, and sellers might use the information to better serve their customers' needs. In this work, we present a comprehensive look into voluntary self-descriptive information found in public customer reviews. We analyzed what people share about themselves and how this contributes to their product opinions. We developed a taxonomy of types of self-descriptions and a machine-learned model that classifies reviews according to this taxonomy. We present new quantitative findings and a thematic study of the perceived purpose of self-descriptions in reviews.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122878363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-06-02 DOI: 10.1609/icwsm.v17i1.22164
Takeshi Kurashima, Tomoharu Iwata, T. Tominaga, Shuhei Yamamoto, Hiroyuki Toda, K. Takemura
Humans make decisions based on their internal value function, and its shape is known to be distorted and biased around a point, which the behavioral economics research community refers to as the reference point. People intensify activities that come to lie within the reach of their reference point, and abstain from acts that would incur losses once they have crossed the point. However, the impact of past experiences on decision making around the reference point has not been well studied. By analyzing a long series of user-level decisions gathered from a competitive programming website, we find that history has a clear impact on users' decision making around the reference point. Past experiences can strengthen, and sometimes weaken, the decision bias around the reference point. Experiences of past difficulties can strengthen the tendency towards loss aversion after achieving the reference point. When a person crosses a reference point for the first time, the cognitive decision bias is significant. However, repeating this crossing gradually weakens the effect. We also show the value of our insights in the task of predicting user behavior. Prediction models incorporating our insights may be used for motivating people to remain more active.
{"title":"Personal History Affects Reference Points: A Case Study of Codeforces","authors":"Takeshi Kurashima, Tomoharu Iwata, T. Tominaga, Shuhei Yamamoto, Hiroyuki Toda, K. Takemura","doi":"10.1609/icwsm.v17i1.22164","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22164","url":null,"abstract":"Humans make decisions based on their internal value function, and its shape is known to be distorted and biased around a point, which the behavioral economics research community refers to as the reference point. People intensify activities that come to lie within the reach of their reference point, and abstain from acts that would incur losses once they have crossed the point. However, the impact of past experiences on decision making around the reference point has not been well studied. By analyzing a long series of user-level decisions gathered from a competitive programming website, we find that history has a clear impact on users' decision making around the reference point. Past experiences can strengthen, and sometimes weaken, the decision bias around the reference point. Experiences of past difficulties can strengthen the tendency towards loss aversion after achieving the reference point. When a person crosses a reference point for the first time, the cognitive decision bias is significant. However, repeating this crossing gradually weakens the effect. We also show the value of our insights in the task of predicting user behavior. Prediction models incorporating our insights may be used for motivating people to remain more active.","PeriodicalId":175641,"journal":{"name":"International Conference on Web and Social Media","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132125681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}