This article introduces a heterophily-based metric for assessing polarization in social networks in which multiple opposing ideological communities coexist. The proposed metric measures polarization at the node level, based on a node’s affinity for other communities. Node-level values can then be aggregated at the community, network, or any intermediate level, yielding a more comprehensive map of polarization. We evaluated our metric on the Polblogs network, the White Helmets Twitter interaction network with two communities, and the VoterFraud2020 domain network with five communities. Additionally, we evaluated it on several sets of synthetic graphs to confirm that it yields low polarization scores where expected. We built synthetic networks in three ways, via synthetic labeling, dK-series, and network models, in order to assess how the proposed measure behaves under various topologies and network features. We then compared our metric to two commonly used polarization metrics, Guerra’s boundary polarization and the random walk controversy score, and examined how it correlates with two network metrics: assortativity and modularity.
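The abstract does not give the metric’s formula; as a minimal illustrative sketch, under the assumption that a node’s polarization grows as its affinity for other communities shrinks, one could score each node by the share of its neighbors inside its own community and then average those scores per community (all names and conventions below are hypothetical, not the paper’s actual definition):

```python
from collections import defaultdict

def node_polarization(adj, community):
    """Toy heterophily-style score (NOT the paper's actual metric):
    1.0 when all of a node's neighbors share its community (insular,
    highly polarized), 0.0 when all neighbors belong to other ones."""
    scores = {}
    for node, neighbors in adj.items():
        if not neighbors:
            scores[node] = 0.0
            continue
        cross = sum(1 for n in neighbors if community[n] != community[node])
        scores[node] = 1.0 - cross / len(neighbors)
    return scores

def aggregate(scores, community):
    """Aggregate node-level scores to the community level by averaging."""
    buckets = defaultdict(list)
    for node, s in scores.items():
        buckets[community[node]].append(s)
    return {c: sum(v) / len(v) for c, v in buckets.items()}
```

The same `aggregate` step could equally be applied to the whole network or to any intermediate grouping, which is what makes the node-level formulation flexible.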
The main objective of our research is to gain a comprehensive understanding of the relationship between language usage within different communities and the ideological narratives that language delineates. We focus specifically on utilizing Natural Language Processing techniques to identify the underlying narratives in the coded or suggestive language employed by non-normative communities associated with targeted violence. Earlier studies detected ideological affiliation through surveys and user studies, with only a limited number relying on the content of text articles, which still requires label curation. Previous work addressed label curation by using ideological subreddits (r/Liberal and r/Conservative for the Liberal and Conservative classes) to label the articles shared on those subreddits according to their prescribed ideologies, albeit with a limited dataset.
Building upon previous work, we use subreddit ideologies to categorize shared articles. In addition to the conservative and liberal classes, we introduce a new category called “Restricted” which encompasses text articles shared in subreddits that are restricted, privatized, or banned, such as r/TheDonald. The “Restricted” class encompasses posts tied to violence, regardless of conservative or liberal affiliations. Additionally, we augment our dataset with text articles from self-identified subreddits like r/progressive and r/askaconservative for the liberal and conservative classes, respectively. This results in an expanded dataset of 377,144 text articles, consisting of 72,488 liberal, 79,573 conservative, and 225,083 restricted class articles. Our goal is to analyze language variances in different ideological communities, investigate keyword relevance in labeling article orientations, especially in unseen cases (922,522 text articles), and delve into radicalized communities, conducting thorough analysis and interpretation of the results.
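A labeling rule of the kind described above can be sketched as a simple lookup. The subreddit names come from the abstract; the function name, the class strings, and the `None` fallback for unseen subreddits are illustrative assumptions:

```python
# Subreddit-to-class lookup (sets of lowercase subreddit names).
LIBERAL = {"liberal", "progressive"}
CONSERVATIVE = {"conservative", "askaconservative"}
RESTRICTED = {"thedonald"}  # restricted, privatized, or banned subreddits

def label_article(subreddit):
    """Label an article by the ideology of the subreddit it was shared in.
    Restricted subreddits take precedence regardless of ideology."""
    s = subreddit.lower()
    if s.startswith("r/"):
        s = s[2:]
    if s in RESTRICTED:
        return "restricted"
    if s in LIBERAL:
        return "liberal"
    if s in CONSERVATIVE:
        return "conservative"
    return None  # unseen subreddit: no label
```

Checking the “Restricted” class first mirrors the abstract’s point that such posts are set apart from the conservative/liberal axis rather than merged into it.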
Ensuring the security of personal accounts has become a key concern due to the widespread use of password attack techniques. Although passwords are the primary defense against unauthorized access, the practice of reusing easy-to-remember passwords increases users’ security risks. Traditional methods for evaluating password strength are often insufficient because they overlook the public personal information that users frequently share on social networks. In addition, while users tend to limit access to their data on single profiles, personal data is often unintentionally shared across multiple profiles, exposing users to password threats. In this paper, we present an extension of a data reconstruction tool, namely soda advance, which incorporates a new module to evaluate password strength based on publicly available data across multiple social networks. The module relies on a new metric to provide a comprehensive evaluation of password strength. Moreover, we investigate the capabilities and risks associated with emerging Large Language Models (LLMs) in evaluating and generating passwords, respectively. Specifically, the proliferation of LLMs made it possible to interact with many of them through Automated Template Learning methodologies. Experimental evaluations, performed with 100 real users, demonstrate the effectiveness of LLMs in generating strong passwords with respect to the data associated with users’ profiles. Furthermore, LLMs also proved effective in evaluation tasks, but the combined usage of LLMs and soda advance yielded better classifications, with improvements of more than 10% in F1-score.
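The abstract does not specify soda advance’s metric; a hypothetical sketch of the underlying idea, starting from a conventional length/character-class baseline and penalizing overlap with tokens harvested from a user’s public profiles, might look like this (function name, weights, and thresholds are all assumptions):

```python
import re

def profile_penalized_strength(password, profile_tokens):
    """Toy profile-aware strength score in [0, 1] (illustrative only,
    not soda advance's actual metric). Baseline = length + character
    variety; penalty = matches against public-profile tokens."""
    classes = sum(bool(re.search(p, password))
                  for p in (r"[a-z]", r"[A-Z]", r"\d", r"[^A-Za-z0-9]"))
    base = min(len(password), 16) / 16 * 0.5 + classes / 4 * 0.5
    lowered = password.lower()
    # Each profile-derived token found inside the password costs 0.25.
    hits = sum(1 for t in profile_tokens
               if len(t) >= 3 and t.lower() in lowered)
    penalty = min(hits * 0.25, 0.8)
    return round(max(base - penalty, 0.0), 3)
```

The point of the sketch is the cross-profile aspect: a password that looks strong in isolation scores poorly once it matches a pet’s name or a birth year visible on another social profile.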
This study offers nuanced insights into the diverse dimensions that dictate the success of social media influencers. Analyzing more than 210,000 social media posts and utilizing the Heuristic-Systematic Model of Information Processing (HSM), it explores diverse factors, including individual appearance characteristics, depth of persuasive power, and various influencer types. The findings shed light on the distinct impacts of different influencer archetypes, such as celebrities and micro-celebrities, on user engagement and reveal the nuanced moderating effects of these archetypes on the relationships among personal attributes, persuasive potency, and influencer success. The proposed model suggests that influencers who leverage deeper, systematic processing strategies, marked by detailed information analysis and conveyance, are poised to experience higher user engagement than counterparts employing heuristic modalities characterized by mental shortcuts and superficial examination. This underscores the importance of balancing heuristic and systematic approaches for emerging influencers and brands aspiring to optimize user engagement and effectively shape consumer behavior. The paper thus offers a comprehensive exploration of the dynamic landscape of influencer marketing through the HSM lens, delivering insights and practical implications for scholars, marketers, and influencers aiming to navigate the intricate determinants of influence in the ever-evolving digital marketing domain.
Echo chambers naturally occur on social networks, where individuals join groups to share and discuss their own interests, driven by algorithms that steer their beliefs and behaviours based on their emotions, biases, and cognitive vulnerabilities. According to recent research on information manipulation and interference, echo chambers have become crucial weapons in the arsenal of Cognitive Warfare, amplifying the effect of psychological techniques aimed at altering information and narratives to influence public perception and shape opinions. Current research focuses on defining assessment methods for detecting emerging echo chambers and monitoring their evolution over time. In this sense, this work stresses the complementary roles of existing topology-based metrics and the semantics of the viewpoints held by groups and their member users. Indeed, this paper proposes a metric based on consensus Group Decision-Making (GDM) that acquires community members’ opinions through Aspect-Based Sentiment Analysis (ABSA) and applies consensus metrics to determine the agreement within a single community and between distinct communities. The potential of the proposed metric has been evaluated on two public datasets of tweets through comparisons with sentiment-aware opinion analysis and state-of-the-art metrics for polarization and echo chamber detection. The results reveal that topology-based metrics that depend strictly on random walks over individuals are not sufficient to fully depict communities’ closeness on topics and the prevailing beliefs that emerge from content analysis.
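The GDM consensus machinery is not detailed in the abstract; a much-simplified stand-in treats each member’s ABSA output as a scalar opinion in [-1, 1] and measures agreement as one minus the mean pairwise distance, both within a community and across two communities (names and normalization are assumptions for illustration):

```python
from itertools import combinations

def within_consensus(opinions):
    """Agreement inside one community: 1 minus the mean pairwise
    |difference| of opinion scores in [-1, 1] (max distance is 2)."""
    if len(opinions) < 2:
        return 1.0
    pairs = list(combinations(opinions, 2))
    return 1 - sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))

def between_consensus(group_a, group_b):
    """Agreement across two communities: 1 minus the mean |difference|
    over all cross-community pairs."""
    total = sum(abs(a - b) for a in group_a for b in group_b)
    return 1 - total / (2 * len(group_a) * len(group_b))
```

In this toy picture, an echo-chamber signature would be high `within_consensus` in each community combined with low `between_consensus` across them, which is exactly the kind of semantic signal a pure random-walk topology metric cannot see.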
Users are often overwhelmed by the amount of information generated on online social networks and media (OSNEM), in particular Twitter, during major events. Summarizing these information streams would help them stay informed in a reasonable time. In parallel, the recent state of the art in summarization focuses on deep neural models and pre-trained language models.
In this context, we aim at (i) evaluating different pre-trained language models (PLMs) for representing microblogs (i.e., tweets), (ii) identifying the most suitable ones in a summarization context, and (iii) investigating how neural models can be used given the input-size limitation of such models. For this purpose, we divided the problem into three questions and conducted experiments on three different datasets. Using a simple greedy algorithm, we first compared several pre-trained models for single-tweet representation. We then evaluated the quality of the average representation of the stream and sought to use it as a starting point for a neural approach. Initial results show the benefit of using USE and Sentence-BERT representations for tweet stream summarization, as well as the great potential of using the average representation of the stream.
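The greedy algorithm itself is not spelled out in the abstract; one plausible reading, sketched below on plain lists standing in for USE or Sentence-BERT embeddings, ranks tweets by cosine similarity to the stream’s average embedding and skips near-duplicates of tweets already selected (the duplicate threshold and selection rule are assumptions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def greedy_summary(embeddings, k, dup_threshold=0.95):
    """Pick k tweet indices: closest to the stream centroid first,
    discarding candidates too similar to already-selected tweets."""
    dim = len(embeddings[0])
    centroid = [sum(e[i] for e in embeddings) / len(embeddings)
                for i in range(dim)]
    ranked = sorted(range(len(embeddings)),
                    key=lambda i: cosine(embeddings[i], centroid),
                    reverse=True)
    selected = []
    for i in ranked:
        if all(cosine(embeddings[i], embeddings[j]) < dup_threshold
               for j in selected):
            selected.append(i)
        if len(selected) == k:
            break
    return selected
```

Using the centroid as the target also shows why the average representation of the stream is a natural starting point for a neural approach: it is a fixed-size summary of an arbitrarily long stream, sidestepping the input-size limitation.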
News stories circulating online, especially on social media platforms, are nowadays a primary source of information. Given the nature of social media, news items are no longer just news: they are embedded in the conversations of the users interacting with them. This is particularly relevant for inaccurate information or even outright misinformation, because user interaction has a crucial impact on whether information is disseminated uncritically. Biased coverage has been shown to affect personal decision-making. Still, it remains an open question whether users are aware of the biased reporting they encounter and how they react to it. The latter is particularly relevant given that user reactions help contextualize reporting for other users and can thus mitigate, but may also exacerbate, the impact of biased media coverage.
This paper approaches the question from a measurement point of view, examining whether reactions to news articles on Twitter can serve as bias indicators, i.e., whether the way users comment on a given article relates to its actual level of bias. We first give an overview of research on media bias before discussing key concepts related to how individuals engage with online content, focusing on the sentiment (or valence) of comments and on outright hate speech. We then present the first dataset connecting reliable human-made media bias classifications of news articles with the reactions these articles received on Twitter. We call our dataset BAT - Bias And Twitter. BAT covers 2,800 (bias-rated) news articles from 255 English-speaking news outlets. Additionally, BAT includes 175,807 comments and retweets referring to these articles.
Based on BAT, we conduct a multi-feature analysis to identify comment characteristics and to analyze whether Twitter reactions correlate with an article’s bias. First, we fine-tune and apply two XLNet-based classifiers for hate speech detection and sentiment analysis. Second, we relate the results of the classifiers to the article bias annotations within a multi-level regression. The results show that Twitter reactions to an article indicate its bias, and vice versa. With a regression coefficient of 0.703, we present evidence that Twitter reactions to biased articles are significantly more hateful. Our analysis also shows that a news outlet’s individual stance reinforces the hate-bias relationship. In future work, we will extend the dataset and analysis to include additional concepts related to media bias.
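A multi-level regression needs grouped data (here, articles nested within outlets, which is how the outlet-stance moderation enters); as a deliberately simplified single-level illustration of the core relationship, the slope linking per-article hate scores to bias ratings can be recovered with ordinary least squares (this is a sketch of the statistical idea, not the paper’s model):

```python
def ols(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.
    A single-level simplification: a multi-level model would add
    per-outlet random intercepts/slopes on top of this."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx
```

A positive slope here corresponds to the reported finding: the more biased an article, the more hateful the Twitter reactions it attracts.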