Pub Date : 2023-09-01 DOI: 10.1016/j.osnem.2023.100267
Ho-Chun Herbert Chang, Becky Pham, Emilio Ferrara
We examine an unexpected but significant source of positive public health messaging during the COVID-19 pandemic: K-pop fandoms. Leveraging more than 7 million tweets related to mask-wearing and K-pop between March 2020 and December 2021, we analyzed the online spread of the hashtag #WearAMask and vaccine-related tweets amid anti-mask sentiments and public health misinformation. Analyses reveal the South Korean boyband BTS as one of the most significant drivers of health discourse. Tweets from health agencies and prominent figures that mentioned K-pop generated 111 times more online responses than tweets that did not. These tweets also elicited strong responses from South America, Southeast Asia, and interior States (areas often neglected by mainstream social media campaigns). Network and temporal analyses show increased use from right-leaning elites over time. Mechanistically, strong levels of parasocial engagement and connectedness allow sustained activism in the community. Our results suggest that public health institutions may leverage pre-existing audience markets to synergistically diffuse and target under-served communities both domestically and globally, especially during health crises.
{"title":"Parasocial diffusion: K-pop fandoms help drive COVID-19 public health messaging on social media","authors":"Ho-Chun Herbert Chang , Becky Pham , Emilio Ferrara","doi":"10.1016/j.osnem.2023.100267","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100267","url":null,"abstract":"<div><p>We examine an unexpected but significant source of positive public health messaging during the COVID-19 pandemic—K-pop fandoms. Leveraging more than 7 million tweets related to mask-wearing and K-pop between March 2020 and December 2021, we analyzed the online spread of the hashtag #WearAMask and vaccine-related tweets amid anti-mask sentiments and public health misinformation. Analyses reveal the South Korean boyband BTS as one of the most significant driver of health discourse. Tweets from health agencies and prominent figures that mentioned K-pop generate 111 times more online responses compared to tweets that did not. These tweets also elicited strong responses from South America, Southeast Asia, and interior States—areas often neglected by mainstream social media campaigns. Network and temporal analysis show increased use from right-leaning elites over time. Mechanistically, strong-levels of parasocial engagement and connectedness allow sustained activism in the community. 
Our results suggest that public health institutions may leverage pre-existing audience markets to synergistically diffuse and target under-served communities both domestically and globally, especially during health crises.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49701459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-01 DOI: 10.1016/j.osnem.2023.100263
Onur Varol
Bots, simply defined as accounts controlled by automation, can be used as a weapon for online manipulation and pose a threat to the health of platforms. Researchers have studied online platforms to detect, estimate, and characterize bot accounts. Concerns about the prevalence of bots were raised following Elon Musk’s bid to acquire Twitter. In this work, we stress that crucial questions need to be answered in order to make a proper estimation and to compare different methodologies and definitions based on behaviors and activities; otherwise, the real questions concerning the health of online platforms will be confounded by disagreements about definitions and models. We argue that assumptions about bot-like behavior, the detection approach, and the population inspected can affect estimates of the percentage of bots on Twitter. Finally, we emphasize the responsibility of platforms to be vigilant, transparent, and unbiased in dealing with threats that may affect their users.
{"title":"Should we agree to disagree about Twitter’s bot problem?","authors":"Onur Varol","doi":"10.1016/j.osnem.2023.100263","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100263","url":null,"abstract":"<div><p>Bots, simply defined as accounts controlled by automation, can be used as a weapon for online manipulation and pose a threat to the health of platforms. Researchers have studied online platforms to detect, estimate, and characterize bot accounts. Concerns about the prevalence of bots were raised following Elon Musk’s bid to acquire Twitter. In this work, we want to stress that crucial questions need to be answered in order to make a proper estimation and compare different methodologies and definitions based on behaviors and activities; otherwise the real questions concerning the health of online platforms will be confounded by disagreements about definitions and models. We argue how assumptions on bot-likely behavior, the detection approach, and the population inspected can affect the estimation of the percentage of bots on Twitter. Finally, we emphasize the responsibility of platforms to be vigilant, transparent, and unbiased in dealing with threats that may affect their users.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49728252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-01 DOI: 10.1016/j.osnem.2023.100255
Felipe de C. Pereira, P. D. de Rezende
{"title":"The Least Cost Directed Perfect Awareness Problem: complexity, algorithms and computations","authors":"Felipe de C. Pereira, P. D. de Rezende","doi":"10.1016/j.osnem.2023.100255","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100255","url":null,"abstract":"","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"54996800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-01 DOI: 10.1016/j.osnem.2023.100269
Franco Galante, Luca Vassio, Michele Garetto, Emilio Leonardi
The increasing popularity of online social networks (OSNs) has attracted growing interest in modeling social interactions. On online social platforms, a few individuals, commonly referred to as influencers, produce the majority of content consumed by users and hegemonize the landscape of the social debate. However, classical opinion models do not capture this communication asymmetry. We develop an opinion model inspired by observations on social media platforms with two main objectives: first, to describe this inherent communication asymmetry in OSNs, and second, to model the effects of content personalization. We derive a Fokker–Planck equation for the temporal evolution of users’ opinion distribution and analytically characterize the stationary system behavior. Analytical results, confirmed by Monte Carlo simulations, show how strict forms of content personalization tend to radicalize user opinion, leading to the emergence of echo chambers, and favor structurally advantaged influencers. As an example application, we apply our model to Facebook data during the Italian government crisis in 2019. Our work provides a flexible framework to evaluate the impact of content personalization on the opinion formation process, focusing on the interaction between influential individuals and regular users. This framework is relevant to marketing and advertising, misinformation spreading, politics, and activism.
{"title":"Modeling communication asymmetry and content personalization in online social networks","authors":"Franco Galante , Luca Vassio , Michele Garetto , Emilio Leonardi","doi":"10.1016/j.osnem.2023.100269","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100269","url":null,"abstract":"<div><p><span>The increasing popularity of online social networks (OSNs) attracted growing interest in modeling social interactions. On online social platforms, a few individuals, commonly referred to as </span><em>influencers</em><span>, produce the majority of content consumed by users and hegemonize the landscape of the social debate. However, classical opinion models do not capture this communication asymmetry. We develop an opinion model inspired by observations on social media platforms<span> with two main objectives: first, to describe this inherent communication asymmetry in OSNs, and second, to model the effects of content personalization. We derive a Fokker–Planck equation for the temporal evolution of users’ opinion distribution and analytically characterize the stationary system behavior. Analytical results, confirmed by Monte-Carlo simulations, show how strict forms of content personalization tend to radicalize user opinion, leading to the emergence of </span></span><em>echo chambers</em>, and favor <em>structurally advantaged</em><span> influencers. As an example application, we apply our model to Facebook data during the Italian government crisis in 2019. Our work provides a flexible framework to evaluate the impact of content personalization on the opinion formation process, focusing on the interaction between influential individuals and regular users. 
This framework is interesting in the context of marketing and advertising, misinformation spreading, politics and activism.</span></p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49728432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-01 DOI: 10.1016/j.osnem.2023.100268
Jialin Liu, Lin Li, Na Li
In recent years, Online Social Networks (OSNs) have become popular content-sharing environments. With the emergence of smartphones with high-quality cameras, people like to share photos of their life moments on OSNs. The photos, however, often contain private information that people do not intend to share with others (e.g., their sensitive relationships). Relying solely on OSN users to manually process photos to protect their relationships can be tedious and error-prone. Therefore, we designed a system that automatically discovers sensitive relations in a photo to be shared online and protects those relations with face-blocking techniques. We first used a Decision Tree model to learn sensitive relations from photos labeled private or public by OSN users. We then defined a face-blocking problem to handle the trade-off between preserving relationship privacy and maintaining photo utility. To address this problem, we developed greedy and linear-programming-based face-blocking techniques. In this paper, we generated synthetic data and used it to evaluate our system performance in terms of privacy protection and photo utility loss.
{"title":"Relationship privacy preservation in photo sharing","authors":"Jialin Liu, Lin Li, Na Li","doi":"10.1016/j.osnem.2023.100268","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100268","url":null,"abstract":"<div><p>In recent years, Online Social Networks<span> (OSN) have become popular content-sharing environments. With the emergence of smartphones with high-quality cameras, people like to share photos of their life moments on OSNs. The photos, however, often contain private information that people do not intend to share with others (e.g., their sensitive relationship). Solely relying on OSN users to manually process photos to protect their relationship can be tedious and error-prone. Therefore, we designed a system to automatically discover sensitive relations in a photo to be shared online and preserve the relations by face blocking techniques. We first used the Decision Tree model to learn sensitive relations from the photos labeled private or public by OSN users. Then we defined a face blocking problem to handle the trade-off between preserving relationship privacy and maintaining the photo utility. To cope with the problem, we developed Greedy and Linear Programming based face blocking technologies. In this paper, we generated synthetic data and used it to evaluate our system performance in terms of privacy protection and photo utility loss.</span></p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49728492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-09-01 DOI: 10.1016/j.osnem.2023.100266
Cheick Tidiane Ba, Matteo Zignani, Sabrina Gaito
The emergence of the Web3 paradigm has led to more and more systems built on blockchain technology and relying on cryptocurrency tokens – both fungible and non-fungible – to sustain themselves and generate profit. The growth and success of these platforms depend strongly on the growth and evolution of the trade relationships among users. In this context, it is of paramount importance to understand the mechanisms behind the evolution and growth dynamics of these economic ties; however, in these systems the trade relationships are strictly intertwined with social dynamics, posing significant challenges for the analysis. One of the most important mechanisms behind the evolution of social networks is the triadic closure principle: given the strict link between the social and economic spheres, it emerges as a potential candidate among the mechanisms in the literature. Therefore, in this work, we extend the existing methodology for triadic closure studies and adapt it to directed networks. We performed an analysis centered around 3-node subgraphs known as “triads” and statistically significant triads referred to as “triadic motifs”, from both a static and a temporal perspective. The methodology was applied to various decentralized socio-economic networks with distinct levels of social components. These networks include currency transfers from the blockchain-based online social media platform Steemit, trade relationships among NFT sellers and buyers on the Ethereum blockchain, and a blockchain-based currency designed for humanitarian aid called Sarafu. Our measurements show that triadic closure is relevant during the evolution of these platforms and, in a few respects, more impactful than in centralized online social networks, where triadic closure is also incentivized by recommendation systems. Moreover, we are able to highlight both similarities and differences across networks with different levels of social components, from both a static and a temporal standpoint. Overall, our work presents strong evidence that triadic closure is an important evolutionary mechanism in decentralized socio-economic networks. Our findings provide a stepping stone in the study of decentralized socio-economic networks. Understanding the evolution of other decentralized networks, whether they do not follow the same Web3 paradigm or have different social components, will provide valuable insight into the dynamics of decentralized systems and potentially improve their design process.
{"title":"Characterizing growth in decentralized socio-economic networks through triadic closure-related network motifs","authors":"Cheick Tidiane Ba, Matteo Zignani, Sabrina Gaito","doi":"10.1016/j.osnem.2023.100266","DOIUrl":"10.1016/j.osnem.2023.100266","url":null,"abstract":"<div><p>The emergence of the Web3 paradigm has led to more and more systems built on blockchain technology and relying on cryptocurrency tokens – both fungible and non-fungible – to sustain themselves and generate profit. The growth and success of these platforms are strongly dependent on the growth and evolution of the trade relationships among users. In this context, it is of paramount importance to understand the mechanism behind the evolution and growth dynamics of these economic ties: however, in these systems the trade relationships are strictly intertwined with social dynamics, posing significant challenges in the analysis. One of the most important mechanisms behind the evolution of social networks is the triadic closure principle: given the strict link between social and economic spheres, the mechanism emerges as a potential candidate among mechanisms in literature. Therefore in this work, we extend the existing methodology for triadic closure studies and adapt it to directed networks. We performed an analysis centered around 3-node subgraphs known as “triads” and statistically significant triads referred to as “triadic motifs”, both from a static and temporal perspective. The methodology was applied to various decentralized socio-economic networks with distinct levels of social components. These networks include currency transfers from the blockchain-based online social media platform Steemit, trade relationships among NFT sellers and buyers on the Ethereum blockchain, and a blockchain-based currency designed for humanitarian aid called Sarafu. 
Our measurements show how triadic closure is relevant during the evolution of these platforms and, for a few aspects, more impactful than centralized online social networks, where triadic closure is also incentivized by recommendation systems. Moreover, we are able to highlight both similarities and differences across networks with different levels of social components, both from a static and temporal standpoint. Overall our work presents strong evidence that triadic closure is an important evolutionary mechanism in decentralized socio-economic networks. Our findings provide a stepping stone in the study of decentralized socio-economic networks. Understanding the evolution of other decentralized networks, not following the same Web3 paradigm or with different social components will provide valuable insight into the understanding of dynamics in decentralized systems and potentially improve their design process.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43869986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-01 DOI: 10.1016/j.osnem.2023.100253
Francisco Bráulio Oliveira, Davoud Mougouei, Amanul Haque, Jaime Simão Sichman, Hoa Khanh Dam, Simon Evans, Aditya Ghose, Munindar P. Singh
The media has been used to disseminate public information amid the Covid-19 pandemic. However, Covid-19 news has triggered emotional responses in people that have impacted their mental well-being and led to news avoidance. To understand the emotional response to Covid-19 news, we studied user comments on news published on Twitter by 37 media outlets in 11 countries from January 2020 to December 2022. We employed a deep-learning-based model to identify the basic human emotions defined by Ekman in comments related to Covid-19 news. Additionally, we implemented Latent Dirichlet Allocation (LDA) to identify the news topics. Our analysis found that while nearly half of the user comments showed no significant emotions, negative emotions were more common. Anger was the most prevalent emotion, particularly in the media and comments regarding political responses and governmental actions in the United States. On the other hand, joy was mainly linked to media outlets from the Philippines and news about vaccination. Over time, anger consistently remained the most prevalent emotion, with fear being most prevalent at the start of the pandemic but decreasing over time, occasionally spiking with news on Covid-19 variants, cases, and deaths. Emotions also varied across media outlets, with Fox News being associated with the highest level of disgust, the second-highest level of anger, and the lowest level of fear. Sadness was highest at Citizen TV, SABC, and Nation Africa, all three African media outlets. Additionally, fear was most evident in the comments on news from The Times of India.
{"title":"Beyond fear and anger: A global analysis of emotional response to Covid-19 news on Twitter","authors":"Francisco Bráulio Oliveira , Davoud Mougouei , Amanul Haque , Jaime Simão Sichman , Hoa Khanh Dam , Simon Evans , Aditya Ghose , Munindar P. Singh","doi":"10.1016/j.osnem.2023.100253","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100253","url":null,"abstract":"<div><p>The media has been used to disseminate public information amid the Covid-19 pandemic. However, Covid-19 news has triggered emotional responses in people that have impacted their mental well-being and led to news avoidance. To understand the emotional response to Covid-19 news, we studied user comments on news published on Twitter by 37 media outlets in 11 countries from January 2020 to December 2022. We employed a deep-learning-based model to identify the basic human emotions defined by Ekman in comments related to Covid-19 news. Additionally, we implemented Latent Dirichlet Allocation (LDA) to identify the news topics. Our analysis found that while nearly half of the user comments showed no significant emotions, negative emotions were more common. Anger was the most prevalent emotion, particularly in the media and comments regarding political responses and governmental actions in the United States. On the other hand, joy was mainly linked to media outlets from the Philippines and news about vaccination. Over time, anger consistently remained the most prevalent emotion, with fear being most prevalent at the start of the pandemic but decreasing over time, occasionally spiking with news on Covid-19 variants, cases, and deaths. Emotions also varied across media outlets, with Fox News being associated with the highest level of disgust, the second-highest level of anger, and the lowest level of fear. Sadness was highest at Citizen TV, SABC, and Nation Africa, all three African media outlets. 
Additionally, fear was most evident in the comments on news from The Times of India.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49888615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-01 DOI: 10.1016/j.osnem.2023.100254
Maddalena Amendola, Andrea Passarella, Raffaele Perego
Social Search research studies methodologies that exploit social information to better satisfy user information needs in Online Social Media while simplifying the search effort and consequently reducing the time spent and the computational resources used. Starting from previous studies, in this work we analyze the current state of the art of the Social Search area, proposing a new taxonomy and highlighting current limitations and open research directions. We divide the Social Search area into three subcategories, in which the social aspect plays a pivotal role: Social Question & Answering, Social Content Search, and Social Collaborative Search. For each subcategory, we present the key concepts and discuss selected representative approaches from the literature in greater detail. We found that, up to now, a large body of studies models users’ preferences and their relations by simply combining social features made available by social platforms. This paves the way for significant research that exploits more structured information about users’ social profiles and behaviours (as they can be inferred from data available on social platforms) to better satisfy their information needs.
{"title":"Social search: Retrieving information in Online Social platforms – A survey","authors":"Maddalena Amendola , Andrea Passarella , Raffaele Perego","doi":"10.1016/j.osnem.2023.100254","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100254","url":null,"abstract":"<div><p><em>Social Search</em> research studies methodologies exploiting social information to better satisfy user information needs in Online Social Media while simplifying the search effort and consequently reducing the time spent and the computational resources utilized. Starting from previous studies, in this work, we analyze the current state of the art of the Social Search area, proposing a new taxonomy and highlighting current limitations and open research directions. We divide the Social Search area into three subcategories, where the social aspect plays a pivotal role: <em>Social Question&Answering</em>, <em>Social Content Search</em>, and <em>Social Collaborative Search</em>. For each subcategory, we present the key concepts and selected representative approaches in the literature in greater detail. We found that, up to now, a large body of studies model users’ preferences and their relations by simply combining social features made available by social platforms. 
It paves the way for significant research to exploit more structured information about users’ social profiles and behaviours (as they can be inferred from data available on social platforms) to optimize their information needs further.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49888616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-01 DOI: 10.1016/j.osnem.2023.100250
Peiling Yi, Arkaitz Zubiaga
Cyberbullying is a pervasive problem in online social media, where a bully abuses a victim through a social media session. By investigating cyberbullying perpetrated through social media sessions, recent research has looked into mining patterns and features for modelling and understanding the two defining characteristics of cyberbullying: repetitive behaviour and power imbalance. In this survey paper, we define a framework that encapsulates the four steps that session-based cyberbullying detection should go through, and we discuss the multiple challenges that distinguish it from single text-based cyberbullying detection. Based on this framework, we provide a comprehensive overview of session-based cyberbullying detection in social media, delving into existing efforts from a data and methodological perspective. Our review leads us to propose evidence-based criteria for a set of best practices to create session-based cyberbullying datasets. In addition, we perform benchmark experiments comparing the performance of state-of-the-art session-based cyberbullying detection models as well as large pre-trained language models across two different datasets. Through our review, we also put forth a set of open challenges as future research directions.
{"title":"Session-based cyberbullying detection in social media: A survey","authors":"Peiling Yi, Arkaitz Zubiaga","doi":"10.1016/j.osnem.2023.100250","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100250","url":null,"abstract":"<div><p>Cyberbullying is a pervasive problem in online social media, where a bully abuses a victim through a social media session. By investigating cyberbullying perpetrated through social media sessions, recent research has looked into mining patterns and features for modelling and understanding the two defining characteristics of cyberbullying: repetitive behaviour and power imbalance. In this survey paper, we define a framework that encapsulates four different steps session-based cyberbullying detection should go through, and discuss the multiple challenges that differ from single text-based cyberbullying detection. Based on this framework, we provide a comprehensive overview of session-based cyberbullying detection in social media, delving into existing efforts from a data and methodological perspective. Our review leads us to proposing evidence-based criteria for a set of best practices to create session-based cyberbullying datasets. In addition, we perform benchmark experiments comparing the performance of state-of-the-art session-based cyberbullying detection models as well as large pre-trained language models across two different datasets. 
Through our review, we also put forth a set of open challenges as future research directions.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49888996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-07-01 DOI: 10.1016/j.osnem.2023.100252
Mauro Conti, Luca Pajola, Pier Paolo Tricomi
Nowadays, people generate and share massive amounts of content on online platforms (e.g., social networks, blogs). In 2021, the 1.9 billion daily active Facebook users posted around 150 thousand photos every minute. Content moderators constantly monitor these online platforms to prevent the spread of inappropriate content (e.g., hate speech, nudity). Based on deep learning (DL) advances, Automatic Content Moderators (ACM) help human moderators handle high data volumes. Despite their advantages, attackers can exploit weaknesses of DL components (e.g., preprocessing, model) to degrade their performance. Therefore, an attacker can leverage such techniques to spread inappropriate content by evading ACM.
In this work, we analyzed 4600 potentially toxic Instagram posts and discovered that 44% of them adopt obfuscations that might undermine ACM. As these posts are reminiscent of captchas (i.e., not understandable by automated mechanisms), we name this threat Captcha Attack (CAPA). Our contributions start by proposing a CAPA taxonomy to better understand how ACM is vulnerable to obfuscation attacks. We then focus on the broad sub-category of CAPA using textual Captcha Challenges, namely CC-CAPA, and we empirically demonstrate that it evades real-world ACM (i.e., Amazon, Google, Microsoft) with 100% accuracy. Our investigation revealed that ACM failures are caused by the OCR text extraction phase. Training OCRs to withstand such obfuscation is therefore crucial, but it requires huge amounts of data. Thus, we investigate methods to identify CC-CAPA samples in large sets of data (originating from three OSNs: Pinterest, Twitter, Yahoo-Flickr), and we empirically demonstrate that supervised techniques identify target styles of samples almost perfectly. Unsupervised solutions, on the other hand, represent a solid methodology for inspecting uncommon data to detect new obfuscation techniques.
{"title":"Turning captchas against humanity: Captcha-based attacks in online social media","authors":"Mauro Conti, Luca Pajola, Pier Paolo Tricomi","doi":"10.1016/j.osnem.2023.100252","DOIUrl":"https://doi.org/10.1016/j.osnem.2023.100252","url":null,"abstract":"<div><p>Nowadays, people generate and share massive amounts of content on online platforms (e.g., social networks, blogs). In 2021, the 1.9 billion daily active Facebook users posted around 150 thousand photos every minute. Content moderators constantly monitor these online platforms to prevent the spreading of inappropriate content (e.g., hate speech, nudity images). Based on deep learning (DL) advances, Automatic Content Moderators (ACM) help human moderators handle high data volume. Despite their advantages, attackers can exploit weaknesses of DL components (e.g., preprocessing, model) to affect their performance. Therefore, an attacker can leverage such techniques to spread inappropriate content by evading ACM.</p><p>In this work, we analyzed 4600 potentially toxic Instagram posts, and we discovered that 44% of them adopt obfuscations that might undermine ACM. As these posts are reminiscent of captchas (i.e., not understandable by automated mechanisms), we coin this threat as Captcha Attack (<span><math><mrow><mi>C</mi><mi>A</mi><mi>P</mi><mi>A</mi></mrow></math></span>). Our contributions start by proposing a <span><math><mrow><mi>C</mi><mi>A</mi><mi>P</mi><mi>A</mi></mrow></math></span> taxonomy to better understand how ACM is vulnerable to obfuscation attacks. We then focus on the broad sub-category of <span><math><mrow><mi>C</mi><mi>A</mi><mi>P</mi><mi>A</mi></mrow></math></span> using textual Captcha Challenges, namely <span>CC-CAPA</span>, and we empirically demonstrate that it evades real-world ACM (i.e., Amazon, Google, Microsoft) with 100% accuracy. Our investigation revealed that ACM failures are caused by the OCR text extraction phase. 
The training of OCRs to withstand such obfuscation is therefore crucial, but huge amounts of data are required. Thus, we investigate methods to identify <span>CC-CAPA</span> samples from large sets of data (originated by three OSN – Pinterest, Twitter, Yahoo-Flickr), and we empirically demonstrate that supervised techniques identify target styles of samples almost perfectly. Unsupervised solutions, on the other hand, represent a solid methodology for inspecting uncommon data to detect new obfuscation techniques.</p></div>","PeriodicalId":52228,"journal":{"name":"Online Social Networks and Media","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49888997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}