Title: Statement of Removal
Pub Date: 2024-03-18 | DOI: 10.1609/icwsm.v13i01.22003
Matteo Zignani, C. Quadri, Alessia Galdeman, S. Gaito, G. P. Rossi
This Statement of Removal refers to: Mastodon Content Warnings: Inappropriate Contents in a Microblogging Platform
{"title":"Statement of Removal","authors":"Matteo Zignani, C. Quadri, Alessia Galdeman, S. Gaito, G. P. Rossi","doi":"10.1609/icwsm.v13i01.22003","DOIUrl":"https://doi.org/10.1609/icwsm.v13i01.22003","url":null,"abstract":"This Statement of Removal refers to: \u0000Mastodon Content Warnings: Inappropriate Contents in a Microblogging Platform","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"311 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140232999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: "A Special Operation": A Quantitative Approach to Dissecting and Comparing Different Media Ecosystems’ Coverage of the Russo-Ukrainian War
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22150
Hans W. A. Hanley, Deepak Kumar, Zakir Durumeric
The coverage of the Russian invasion of Ukraine has varied widely across the Western, Russian, and Chinese media ecosystems, with propaganda, disinformation, and narrative spins present in all three. By utilizing the normalized pointwise mutual information metric, differential sentiment analysis, word2vec models, and partially labeled Dirichlet allocation, we present a quantitative analysis of the differences in coverage among these three news ecosystems. We find that while Western press outlets have focused on the military and humanitarian aspects of the war, Russian media have focused on the purported justifications for the “special military operation”, such as the presence in Ukraine of “bio-weapons” and “neo-nazis”, and Chinese news media have concentrated on the conflict’s diplomatic and economic consequences. Detecting several Russian disinformation narratives in the articles of several Chinese media outlets, we then measure the degree to which Russian media have influenced Chinese coverage across Chinese outlets’ news articles, Weibo accounts, and Twitter accounts. Our analysis indicates that since the Russian invasion of Ukraine, Chinese state media outlets have increasingly cited Russian outlets as news sources and spread Russian disinformation narratives.
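The normalized pointwise mutual information (NPMI) metric mentioned here has a standard definition, NPMI(x, y) = PMI(x, y) / (-log p(x, y)) with PMI(x, y) = log(p(x, y) / (p(x) p(y))). A minimal sketch of computing it from document-level co-occurrence counts; the toy corpus and tokenization are illustrative stand-ins, not the paper's data or pipeline:

```python
import math
from collections import Counter
from itertools import combinations

def npmi_scores(documents):
    """Compute NPMI for word pairs from a list of tokenized documents.

    Probabilities are estimated from document-level co-occurrence:
    p(x) = fraction of documents containing x, p(x, y) likewise.
    NPMI ranges over [-1, 1]; 1 means the words always co-occur.
    """
    n_docs = len(documents)
    word_counts, pair_counts = Counter(), Counter()
    for doc in documents:
        vocab = set(doc)
        word_counts.update(vocab)
        pair_counts.update(combinations(sorted(vocab), 2))

    scores = {}
    for (x, y), n_xy in pair_counts.items():
        if n_xy == n_docs:
            continue  # -log p(x, y) would be 0; skip the degenerate case
        p_x, p_y = word_counts[x] / n_docs, word_counts[y] / n_docs
        p_xy = n_xy / n_docs
        pmi = math.log(p_xy / (p_x * p_y))
        scores[(x, y)] = pmi / -math.log(p_xy)
    return scores

# Hypothetical toy corpus: "special" and "operation" co-occur strongly.
docs = [["special", "operation", "donbas"],
        ["special", "operation", "sanctions"],
        ["sanctions", "economy"]]
print(npmi_scores(docs)[("operation", "special")])  # close to 1.0
```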
{"title":"\"A Special Operation\": A Quantitative Approach to Dissecting and Comparing Different Media Ecosystems’ Coverage of the Russo-Ukrainian War","authors":"Hans W. A. Hanley, Deepak Kumar, Zakir Durumeric","doi":"10.1609/icwsm.v17i1.22150","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22150","url":null,"abstract":"The coverage of the Russian invasion of Ukraine has varied widely between Western, Russian, and Chinese media ecosystems with propaganda, disinformation, and narrative spins present in all three. By utilizing the normalized pointwise mutual information metric, differential sentiment analysis, word2vec models, and partially labeled Dirichlet allocation, we present a quantitative analysis of the differences in coverage amongst these three news ecosystems. We find that while the Western press outlets have focused on the military and humanitarian aspects of the war, Russian media have focused on the purported justifications for the “special military operation” such as the presence in Ukraine of “bio-weapons” and “neo-nazis”, and Chinese news media have concentrated on the conflict’s diplomatic and economic consequences. Detecting the presence of several Russian disinformation narratives in the articles of several Chinese media outlets, we finally measure the degree to which Russian media has influenced Chinese coverage across Chinese outlets’ news articles, Weibo accounts, and Twitter accounts. Our analysis indicates that since the Russian invasion of Ukraine, Chinese state media outlets have increasingly cited Russian outlets as news sources and spread Russian disinformation narratives.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135910221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Quotatives Indicate Decline in Objectivity in U.S. Political News
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22152
Tiancheng Hu, Manoel Horta Ribeiro, Robert West, Andreas Spitz
According to journalistic standards, direct quotes should be attributed to sources with objective quotatives such as “said” and “told,” since nonobjective quotatives, e.g., “argued” and “insisted,” would influence the readers' perception of the quote and the quoted person. In this paper, we analyze the adherence to this journalistic norm to study trends in objectivity in political news across U.S. outlets of different ideological leanings. We ask: 1) How has the usage of nonobjective quotatives evolved? 2) How do news outlets use nonobjective quotatives when covering politicians of different parties? To answer these questions, we developed a dependency-parsing-based method to extract quotatives and applied it to Quotebank, a web-scale corpus of attributed quotes, obtaining nearly 7 million quotes, each enriched with the quoted speaker's political party and the ideological leaning of the outlet that published the quote. We find that, while partisan outlets are the ones that most often use nonobjective quotatives, between 2013 and 2020, the outlets that increased their usage of nonobjective quotatives the most were “moderate” centrist news outlets (around 0.6 percentage points, or 20% in relative percentage over seven years). Further, we find that outlets use nonobjective quotatives more often when quoting politicians of the opposing ideology (e.g., left-leaning outlets quoting Republicans) and that this “quotative bias” is rising at a swift pace, increasing up to 0.5 percentage points, or 25% in relative percentage, per year. These findings suggest an overall decline in journalistic objectivity in U.S. political news.
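The dependency-parsing-based extraction the authors describe can be approximated in a few lines with spaCy. The sketch below is illustrative only: the objective-verb list is a stand-in derived from the norm stated above, not the paper's lexicon, and the heuristic is simpler than a production pipeline would be:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

OBJECTIVE_LEMMAS = {"say", "tell"}  # stand-in list, per the stated norm

def extract_quotatives(text):
    """Return (speaker, quotative lemma, is_objective) triples.

    Heuristic: a quotative is a verb that governs a clausal complement
    (the reported speech, dep "ccomp") and has a nominal subject
    (the speaker, dep "nsubj").
    """
    results = []
    for token in nlp(text):
        if token.pos_ == "VERB" and any(c.dep_ == "ccomp" for c in token.children):
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            if subjects:
                results.append((subjects[0].text, token.lemma_,
                                token.lemma_ in OBJECTIVE_LEMMAS))
    return results

print(extract_quotatives("The senator insisted that the bill would pass."))
# e.g. [('senator', 'insist', False)] -- a nonobjective quotative
```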
{"title":"Quotatives Indicate Decline in Objectivity in U.S. Political News","authors":"Tiancheng Hu, Manoel Horta Ribeiro, Robert West, Andreas Spitz","doi":"10.1609/icwsm.v17i1.22152","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22152","url":null,"abstract":"According to journalistic standards, direct quotes should be attributed to sources with objective quotatives such as ``said'' and ``told,'' since nonobjective quotatives, e.g., ``argued'' and ``insisted,'' would influence the readers' perception of the quote and the quoted person. In this paper, we analyze the adherence to this journalistic norm to study trends in objectivity in political news across U.S. outlets of different ideological leanings. We ask: 1) How has the usage of nonobjective quotatives evolved? 2) How do news outlets use nonobjective quotatives when covering politicians of different parties? To answer these questions, we developed a dependency-parsing-based method to extract quotatives and applied it to Quotebank, a web-scale corpus of attributed quotes, obtaining nearly 7 million quotes, each enriched with the quoted speaker's political party and the ideological leaning of the outlet that published the quote. We find that, while partisan outlets are the ones that most often use nonobjective quotatives, between 2013 and 2020, the outlets that increased their usage of nonobjective quotatives the most were ``moderate'' centrist news outlets (around 0.6 percentage points, or 20% in relative percentage over seven years). Further, we find that outlets use nonobjective quotatives more often when quoting politicians of the opposing ideology (e.g., left-leaning outlets quoting Republicans) and that this ``quotative bias'' is rising at a swift pace, increasing up to 0.5 percentage points, or 25% in relative percentage, per year. These findings suggest an overall decline in journalistic objectivity in U.S. political news.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136040991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Conversation Modeling to Predict Derailment
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22200
Jiaqing Yuan, Munindar P. Singh
Conversations among online users sometimes derail, i.e., break down into personal attacks. Derailment interferes with the healthy growth of communities in cyberspace. The ability to predict whether an ongoing conversation will derail could provide valuable advance, even real-time, insight to both interlocutors and moderators. Prior approaches predict conversation derailment retrospectively, without the ability to forestall derailment proactively. Some existing works attempt to make dynamic predictions as the conversation develops but fail to incorporate multisource information, such as conversational structure and distance to derailment. We propose a hierarchical transformer-based framework that combines utterance-level and conversation-level information to capture fine-grained contextual semantics. We propose a domain-adaptive pretraining objective to integrate conversational structure information and a multitask learning scheme to leverage the distance from each utterance to derailment. An evaluation of our framework on two conversation derailment datasets shows an improvement in F1 score for the prediction of derailment. These results demonstrate the effectiveness of incorporating multisource information for predicting the derailment of a conversation.
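A minimal PyTorch sketch of such a hierarchical, multitask architecture, assuming utterances have already been embedded by a pretrained encoder. The dimensions, pooling, and head design here are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class HierarchicalDerailmentModel(nn.Module):
    """Two-level sketch: utterance embeddings are contextualized by a
    conversation-level transformer; two heads share the encoder
    (multitask learning), one predicting derailment and one regressing
    the distance (in utterances) to the derailment point."""

    def __init__(self, utt_dim=768, n_heads=8, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=utt_dim, nhead=n_heads,
                                           batch_first=True)
        self.conversation_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.derailment_head = nn.Linear(utt_dim, 2)  # will it derail?
        self.distance_head = nn.Linear(utt_dim, 1)    # utterances until derailment

    def forward(self, utterance_embs, pad_mask=None):
        # utterance_embs: (batch, n_utterances, utt_dim), e.g. pooled BERT vectors
        ctx = self.conversation_encoder(utterance_embs,
                                        src_key_padding_mask=pad_mask)
        conv_repr = ctx.mean(dim=1)  # pool over utterances
        return self.derailment_head(conv_repr), self.distance_head(conv_repr)

model = HierarchicalDerailmentModel()
fake_conv = torch.randn(4, 10, 768)  # 4 conversations, 10 utterances each
logits, distance = model(fake_conv)
loss = (nn.functional.cross_entropy(logits, torch.randint(0, 2, (4,)))
        + nn.functional.mse_loss(distance.squeeze(-1), torch.rand(4) * 10))
loss.backward()
```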
{"title":"Conversation Modeling to Predict Derailment","authors":"Jiaqing Yuan, Munindar P. Singh","doi":"10.1609/icwsm.v17i1.22200","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22200","url":null,"abstract":"Conversations among online users sometimes derail, i.e., break down into personal attacks. Derailment interferes with the healthy growth of communities in cyberspace. The ability to predict whether an ongoing conversation will derail could provide valuable advance, even real-time, insight to both interlocutors and moderators. Prior approaches predict conversation derailment retrospectively without the ability to forestall the derailment proactively. Some existing works attempt to make dynamic predictions as the conversation develops, but fail to incorporate multisource information, such as conversational structure and distance to derailment. We propose a hierarchical transformer-based framework that combines utterance-level and conversation-level information to capture fine-grained contextual semantics. We propose a domain-adaptive pretraining objective to unite conversational structure information and a multitask learning scheme to leverage the distance from each utterance to derailment. An evaluation of our framework on two conversation derailment datasets shows an improvement in F1 score for the prediction of derailment. These results demonstrate the effectiveness of incorporating multisource information for predicting the derailment of a conversation.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136041107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: "Dummy Grandpa, Do You Know Anything?": Identifying and Characterizing Ad Hominem Fallacy Usage in the Wild
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22180
Utkarsh Patel, Animesh Mukherjee, Mainack Mondal
Today, participating in discussions on online forums is extremely commonplace, and these discussions have started to exert a strong influence on the overall opinion of online users. Naturally, twisting the flow of an argument can have a strong impact on the minds of naive users, which in the long run might have socio-political ramifications, for example, winning an election or spreading targeted misinformation. Thus, these platforms are potentially highly vulnerable to malicious players who might act individually or as a cohort to breed fallacious arguments with a motive to sway public opinion. Ad hominem arguments are one of the most effective forms of such fallacies. Although a simple fallacy, it is effective enough to sway public debates in the offline world and can be used as a precursor to shutting down the voice of the opposition through slander. In this work, we take a first step toward shedding light on the usage of ad hominem fallacies in the wild. First, we build a powerful ad hominem detector based on a transformer architecture with high accuracy (F1 above 83%, a significant improvement over prior work), even for datasets in which annotated instances constitute a very small fraction. We then apply our detector to 265k arguments collected from the online debate forum CreateDebate. Our crowdsourced surveys validate our in-the-wild predictions on the CreateDebate data (94% agreement with manual annotation). Our analysis reveals that a surprising 31.23% of CreateDebate content contains ad hominem fallacies, and that a cohort of highly active users posts significantly more ad hominem arguments to suppress opposing views. Our temporal analysis further reveals that ad hominem usage has increased significantly since the 2016 US Presidential election, not only for topics like Politics but also for Science and Law. We conclude by discussing important implications of our work for detecting and defending against ad hominem fallacies.
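Fine-tuning a transformer for this kind of binary argument classification typically looks like the Hugging Face sketch below; the checkpoint, hyperparameters, and toy examples are assumptions for illustration, not the authors' setup:

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Any BERT-family checkpoint works as a starting point; the paper does not
# prescribe this particular one.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # ad hominem vs. not

class ArgumentDataset(torch.utils.data.Dataset):
    """Wraps tokenized arguments and binary labels for the Trainer."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=256)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

# Hypothetical toy examples; real training would use the annotated corpus.
train_ds = ArgumentDataset(["You clearly know nothing about this topic.",
                            "The data do not support that conclusion."], [1, 0])
trainer = Trainer(model=model,
                  args=TrainingArguments(output_dir="adhom", num_train_epochs=3),
                  train_dataset=train_ds)
trainer.train()
```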
{"title":"\"Dummy Grandpa, Do You Know Anything?\": Identifying and Characterizing Ad Hominem Fallacy Usage in the Wild","authors":"Utkarsh Patel, Animesh Mukherjee, Mainack Mondal","doi":"10.1609/icwsm.v17i1.22180","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22180","url":null,"abstract":"Today, participating in discussions on online forums is extremely commonplace and these discussions have started rendering a strong influence on the overall opinion of online users. Naturally, twisting the flow of the argument can have a strong impact on the minds of naive users, which in the long run might have socio-political ramifications, for example, winning an election or spreading targeted misinformation. Thus, these platforms are potentially highly vulnerable to malicious players who might act individually or as a cohort to breed fallacious arguments with a motive to sway public opinion. Ad hominem arguments are one of the most effective forms of such fallacies. Although a simple fallacy, it is effective enough to sway public debates in offline world and can be used as a precursor to shutting down the voice of opposition by slander. In this work, we take a first step in shedding light on the usage of ad hominem fallacies in the wild. First, we build a powerful ad hominem detector based on transformer architecture with high accuracy (F1 more than 83%, showing a significant improvement over prior work), even for datasets for which annotated instances constitute a very small fraction. We then used our detector on 265k arguments collected from the online debate forum – CreateDebate. Our crowdsourced surveys validate our in-the-wild predictions on CreateDebate data (94% match with manual annotation). Our analysis revealed that a surprising 31.23% of CreateDebate content contains ad hominem fallacy, and a cohort of highly active users post significantly more ad hominem to suppress opposing views. Then, our temporal analysis revealed that ad hominem argument usage increased significantly since the 2016 US Presidential election, not only for topics like Politics, but also for Science and Law. We conclude by discussing important implications of our work to detect and defend against ad hominem fallacies.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136041101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: What Are You Anxious About? Examining Subjects of Anxiety during the COVID-19 Pandemic
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22133
Lucia L. Chen, Steven R. Wilson, Sophie Lohmann, Daniela V. Negraia
COVID-19 has had disproportionate mental health consequences for the public during different phases of the pandemic. We use a computational approach to capture the specific aspects that trigger the public's anxiety about the pandemic and to investigate how these aspects change over time. First, we identified nine subjects of anxiety (SOAs) in a sample of Reddit posts (N=86) from r/COVID19_support using a thematic analysis approach. Then, we quantified Reddit users' anxiety by training algorithms on a manually annotated sample (N=793) to label the SOAs in a larger chronological sample (N=6,535). The nine SOAs align with items in various recently developed pandemic anxiety measurement scales. We observed that Reddit users' concerns about health risks remained high during the first eight months of the pandemic and then diminished dramatically despite later surges in cases. In general, users' language disclosing the SOAs became less intense as the pandemic progressed. However, worries about mental health and the future steadily increased throughout the period covered by this study. People also tended to use more intense language to describe mental health concerns than health risk or death concerns. Our results suggest that the public's mental health does not necessarily improve even as the health threat posed by COVID-19 gradually weakens due to appropriate countermeasures. Our system lays the groundwork for population health and epidemiology scholars to examine, in a timely fashion, the aspects that provoke pandemic anxiety.
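Since a post can voice several SOAs at once, labeling them is naturally a multi-label classification task. A minimal scikit-learn sketch of that framing; the label names and example posts are hypothetical, and the paper's actual models may well differ:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical posts and SOA labels; one binary classifier is fit per SOA.
posts = ["Worried I caught it at the store, my chest feels tight.",
         "I can't stop thinking about losing my job and what comes next."]
soas = [["health_risks"], ["financial", "future"]]

binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(soas)  # posts x SOAs indicator matrix

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    OneVsRestClassifier(LogisticRegression(max_iter=1000)))
clf.fit(posts, y)

pred = clf.predict(["Anxious about the future of the economy."])
print(binarizer.inverse_transform(pred))  # SOA labels assigned to the new post
```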
{"title":"What Are You Anxious About? Examining Subjects of Anxiety during the COVID-19 Pandemic","authors":"Lucia L. Chen, Steven R. Wilson, Sophie Lohmann, Daniela V. Negraia","doi":"10.1609/icwsm.v17i1.22133","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22133","url":null,"abstract":"COVID-19 poses disproportionate mental health consequences to the public during different phases of the pandemic. We use a computational approach to capture the specific aspects that trigger the public's anxiety about the pandemic and investigate how these aspects change over time. First, we identified nine subjects of anxiety (SOAs) in a sample of Reddit posts (N=86) from r/COVID19_support using the thematic analysis approach. Then, we quantified Reddit users' anxiety by training algorithms on a manually annotated sample (N=793) to annotate the SOAs in a larger chronological sample (N=6,535). The nine SOAs align with items in various recently developed pandemic anxiety measurement scales. We observed that Reddit users' concerns about health risks remained high in the first eight months since the pandemic started. These concerns diminished dramatically despite the surge of cases occurring later. In general, users' language disclosing the SOAs became less intense as the pandemic progressed. However, worries about mental health and the future steadily increased throughout the period covered in this study. People also tended to use more intense language to describe mental health concerns than health risk or death concerns. Our results suggest that the public's mental health condition does not necessarily improve despite COVID-19 as a health threat gradually weakening due to appropriate countermeasures. Our system lays the groundwork for population health and epidemiology scholars to examine aspects that provoke pandemic anxiety in a timely fashion.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136041103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: The Half-Life of a Tweet
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22228
Jürgen Pfeffer, Daniel Matter, Anahit Sargsyan
Twitter has started to share an impression count variable as part of the public metrics available for every Tweet collected with Twitter’s APIs. Knowing how often a particular Tweet had been shown to Twitter users at the time of data collection, and measuring its impression count repeatedly over time, lets us gain important insights into a Tweet’s dissemination process. Our preliminary analysis shows that, on average, impressions per second peak 72 seconds after a Tweet is sent and that after 24 hours, no relevant number of impressions can be observed for ∼95% of all Tweets. Finally, we estimate that the median half-life of a Tweet, i.e., the time until half of all its impressions have occurred, is about 80 minutes.
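Given repeated impression measurements for a Tweet, its half-life can be estimated by interpolating where the cumulative count crosses half of its final value. A sketch with a hypothetical measurement series (not the paper's data):

```python
import numpy as np

def tweet_half_life(timestamps, impressions):
    """Estimate a Tweet's half-life from repeated impression measurements.

    timestamps: seconds since the Tweet was posted, increasing
    impressions: cumulative impression counts at those times
    Returns the (linearly interpolated) time at which half of the
    final impression count was reached.
    """
    t = np.asarray(timestamps, dtype=float)
    n = np.asarray(impressions, dtype=float)
    half = n[-1] / 2.0
    return float(np.interp(half, n, t))  # n is increasing, so interp is valid

# Hypothetical polling series over 24 hours (times in seconds):
t = [60, 300, 1800, 3600, 7200, 86400]
n = [900, 2500, 5200, 6800, 8000, 8600]
print(tweet_half_life(t, n) / 60, "minutes")  # time to half the impressions
```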
{"title":"The Half-Life of a Tweet","authors":"Jürgen Pfeffer, Daniel Matter, Anahit Sargsyan","doi":"10.1609/icwsm.v17i1.22228","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22228","url":null,"abstract":"Twitter has started to share an impression count variable as part of the available public metrics for every Tweet collected with Twitter’s APIs. With the information about how often a particular Tweet has been shown to Twitter users at the time of data collection, we can learn important insights about the dissemination process of a Tweet by measuring its impression count repeatedly over time. With our preliminary analysis, we can show that on average the peak of impressions per second is 72 seconds after a Tweet was sent and that after 24 hours, no relevant number of impressions can be observed for ∼95% of all Tweets. Finally, we estimate that the median half-life of a Tweet, i.e. the time it takes before half of all impressions are created, is about 80 minutes.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136041110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: A Data Fusion Framework for Multi-Domain Morality Learning
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22145
Siyi Guo, Negar Mokhberian, Kristina Lerman
Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground-truth datasets with moral annotations have been released. However, these datasets vary in the method of data collection, domain, topics, instructions for annotators, etc. Simply aggregating such heterogeneous datasets during training can yield models that fail to generalize well. We describe a data fusion framework for training on multiple heterogeneous datasets that improves performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to deal with label shift. We show that the proposed framework achieves state-of-the-art performance across different datasets compared with prior work on morality inference.
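Domain adversarial training is commonly implemented with a gradient reversal layer between a shared encoder and a domain classifier. A minimal PyTorch sketch of that mechanism; the layer sizes, number of domains, and unweighted loss here are illustrative assumptions, not the paper's configuration:

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on the
    backward pass, so the shared encoder learns to *confuse* the domain
    classifier -- the core trick of domain adversarial training."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None  # no gradient for lambd

features = nn.Sequential(nn.Linear(768, 256), nn.ReLU())  # shared encoder
label_head = nn.Linear(256, 2)   # moral sentiment head
domain_head = nn.Linear(256, 5)  # which source dataset (5 domains assumed)

x = torch.randn(8, 768)          # e.g. pooled language-model embeddings
z = features(x)
label_logits = label_head(z)
domain_logits = domain_head(GradReverse.apply(z, 1.0))
# Total loss: task loss + adversarial domain loss. Per-example weights could
# be added to the task loss to correct for label shift, as the paper does.
loss = (nn.functional.cross_entropy(label_logits, torch.randint(0, 2, (8,)))
        + nn.functional.cross_entropy(domain_logits, torch.randint(0, 5, (8,))))
loss.backward()
```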
{"title":"A Data Fusion Framework for Multi-Domain Morality Learning","authors":"Siyi Guo, Negar Mokhberian, Kristina Lerman","doi":"10.1609/icwsm.v17i1.22145","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22145","url":null,"abstract":"Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground truth datasets with moral annotations have been released. However, these datasets vary in the method of data collection, domain, topics, instructions for annotators, etc. Simply aggregating such heterogeneous datasets during training can yield models that fail to generalize well. We describe a data fusion framework for training on multiple heterogeneous datasets that improve performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to deal with label shift. We show that the proposed framework achieves state-of-the-art performance in different datasets compared to prior works in morality inference.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136040981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: A Multi-Platform Collection of Social Media Posts about the 2022 U.S. Midterm Elections
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22205
Rachith Aiyappa, Matthew R. DeVerna, Manita Pote, Bao Tran Truong, Wanying Zhao, David Axelrod, Aria Pessianzadeh, Zoher Kachwala, Munjung Kim, Ozgur Can Seckin, Minsuk Kim, Sunny Gandhi, Amrutha Manikonda, Francesco Pierri, Filippo Menczer, Kai-Cheng Yang
Social media are utilized by millions of citizens to discuss important political issues. Politicians use these platforms to connect with the public and broadcast policy positions. Data from social media have therefore enabled many studies of political discussion. While most analyses are limited to data from individual platforms, people are embedded in a larger information ecosystem spanning multiple social networks. Here we describe and provide access to the Indiana University 2022 U.S. Midterms Multi-Platform Social Media Dataset (MEIU22), a collection of social media posts from Twitter, Facebook, Instagram, Reddit, and 4chan. MEIU22 links to posts about the midterm elections, identified via a comprehensive list of keywords, and tracks the social media accounts of 1,011 candidates from October 1 to December 25, 2022. We also publish the source code of our pipeline to enable similar multi-platform research projects.
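The keyword-matching step of such a collection pipeline can be sketched as a simple filter; the keyword list below is a stand-in for illustration, not the published MEIU22 list:

```python
import re

# Stand-in keywords; the actual list is released with the MEIU22 dataset.
KEYWORDS = ["midterm", "election2022", "senate race", "ballot"]
pattern = re.compile(r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b",
                     re.IGNORECASE)

def matches_election_keywords(post_text):
    """True if a post mentions any tracked keyword (word-boundary match)."""
    return bool(pattern.search(post_text))

# Hypothetical posts from different platforms, filtered with one rule:
posts = [
    {"platform": "reddit", "text": "Who is running in the Senate race this year?"},
    {"platform": "twitter", "text": "Nice weather today."},
]
kept = [p for p in posts if matches_election_keywords(p["text"])]
print(kept)  # only the first post survives the filter
```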
{"title":"A Multi-Platform Collection of Social Media Posts about the 2022 U.S. Midterm Elections","authors":"Rachith Aiyappa, Matthew R. DeVerna, Manita Pote, Bao Tran Truong, Wanying Zhao, David Axelrod, Aria Pessianzadeh, Zoher Kachwala, Munjung Kim, Ozgur Can Seckin, Minsuk Kim, Sunny Gandhi, Amrutha Manikonda, Francesco Pierri, Filippo Menczer, Kai-Cheng Yang","doi":"10.1609/icwsm.v17i1.22205","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22205","url":null,"abstract":"Social media are utilized by millions of citizens to discuss important political issues. Politicians use these platforms to connect with the public and broadcast policy positions. Therefore, data from social media has enabled many studies of political discussion. While most analyses are limited to data from individual platforms, people are embedded in a larger information ecosystem spanning multiple social networks. Here we describe and provide access to the Indiana University 2022 U.S. Midterms Multi-Platform Social Media Dataset (MEIU22), a collection of social media posts from Twitter, Facebook, Instagram, Reddit, and 4chan. MEIU22 links to posts about the midterm elections based on a comprehensive list of keywords and tracks the social media accounts of 1,011 candidates from October 1 to December 25, 2022. We also publish the source code of our pipeline to enable similar multi-platform research projects.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136041295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: #RoeOverturned: Twitter Dataset on the Abortion Rights Controversy
Rong-Ching Chang, Ashwin Rao, Qiankun Zhong, Magdalena Wojcieszak, Kristina Lerman
Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22207
On June 24, 2022, the United States Supreme Court overturned the landmark rulings of its 1973 verdict in Roe v. Wade. By way of a majority vote in Dobbs v. Jackson Women's Health Organization, the justices decided that abortion was not a constitutional right and returned the issue of abortion to elected representatives. This decision triggered multiple protests and debates across the US, especially in the context of the midterm elections of November 2022. Given that many citizens use social media platforms to express their views and mobilize for collective action, and given that online debate has tangible effects on public opinion, political participation, news media coverage, and political decision-making, it is crucial to understand the online discussions surrounding this topic. Toward this end, we present the first large-scale Twitter dataset collected on the abortion rights debate in the United States: a set of 74M tweets systematically collected over the course of one year, from January 1, 2022 to January 6, 2023.
{"title":"#RoeOverturned: Twitter Dataset on the Abortion Rights Controversy","authors":"Rong-Ching Chang, Ashwin Rao, Qiankun Zhong, Magdalena Wojcieszak, Kristina Lerman","doi":"10.1609/icwsm.v17i1.22207","DOIUrl":"https://doi.org/10.1609/icwsm.v17i1.22207","url":null,"abstract":"On June 24, 2022, the United States Supreme Court overturned landmark rulings made in its 1973 verdict in Roe v. Wade. The justices by way of a majority vote in Dobbs v. Jackson Women's Health Organization, decided that abortion wasn't a constitutional right and returned the issue of abortion to the elected representatives. This decision triggered multiple protests and debates across the US, especially in the context of the midterm elections in November 2022. Given that many citizens use social media platforms to express their views and mobilize for collective action, and given that online debate provides tangible effects on public opinion, political participation, news media coverage, and the political decision-making, it is crucial to understand online discussions surrounding this topic. Toward this end, we present the first large-scale Twitter dataset collected on the abortion rights debate in the United States. We present a set of 74M tweets systematically collected over the course of one year from January 1, 2022 to January 6, 2023.","PeriodicalId":338112,"journal":{"name":"Proceedings of the International AAAI Conference on Web and Social Media","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135909940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}