
Online Social Networks and Media: Latest Articles

Localization of Unidentified Events with Raw Microblogging Data
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100209
Usman Anjum, Vladimir Zadorozhny, Prashant Krishnamurthy

Event localization is the task of finding the location of an event. Commonly, event localization using microblogging services, like Twitter, uses the contents of the messages and the geographical information associated with the messages. In this paper, we propose a novel approach called SPARE (SPAtial REconstruction) that bypasses the need for geographical or semantic information to localize tweets. We assume there are reference coordinates at known locations that scrape the microblog (tweet) counts in time and space (circular regions around the reference coordinate). The tweet counts are aggregated and then disaggregated to identify event patterns. A change in the tweet counts is indicative of an event pattern. We show, using real data, that the change in counts of tweets is manifested as peaks. The peaks from multiple reference coordinates can be used as an input to trilateration techniques to pinpoint the location of an event. We introduce metrics to identify the quality of disaggregation of fine-grained data and examine techniques like filtering to improve accuracy of event location. The experimental results show that our method can identify the location of an event with high accuracy.
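
The trilateration step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a distance estimate from each reference coordinate to the event is already available; how SPARE derives such estimates from tweet-count peaks is not described in the abstract, so the function name `trilaterate` and its inputs are hypothetical. The sketch uses the standard linearization: subtracting the first circle equation from the others gives a linear system that is solved by least squares.

```python
def trilaterate(anchors, distances):
    """Estimate (x, y) from >= 3 reference points and distances to the target.

    anchors   -- list of (x_i, y_i) reference coordinates (assumed planar)
    distances -- list of estimated distances d_i (assumed given, hypothetical input)
    """
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need at least three anchors with matching distances")
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    # Subtract circle 0 from circle i:
    # 2x(x_i - x0) + 2y(y_i - y0) = d0^2 - d_i^2 + x_i^2 - x0^2 + y_i^2 - y0^2
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)
    # Solve the 2x2 normal equations (A^T A) p = A^T b.
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not determined")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

With reference points at (0, 0), (10, 0) and (0, 10) and exact distances to a target at (3, 4), the function recovers that target; with noisy distances it returns the least-squares estimate instead.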

Citations: 0
Rumour spread minimization in social networks: A source-ignorant approach
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100206
Ahmad Zareie, Rizos Sakellariou

The spread of rumours in social networks has become a significant challenge in recent years. Blocking so-called critical edges, that is, edges that have a significant role in the spreading process, has attracted lots of attention as a means to minimize the spread of rumours. Although the detection of the sources of rumour may help identify critical edges, this has an overhead that source-ignorant approaches are trying to eliminate. Several source-ignorant edge blocking methods have been proposed, which mostly determine critical edges on the basis of centrality. Taking into account additional features of edges (beyond centrality) may help determine which edges to block more accurately. In this paper, a new source-ignorant method is proposed to identify a set of critical edges by considering for each edge the impact of blocking and the influence of the nodes connected to the edge. Experimental results demonstrate that the proposed method can identify critical edges more accurately in comparison to other source-ignorant methods.
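
As a point of reference for the centrality-based baselines this abstract mentions, the following is a minimal sketch of source-ignorant edge blocking. It is not the method proposed in the paper: the score used here (the product of the endpoint degrees) is an assumed, simple centrality proxy, and the function name is hypothetical.

```python
def top_k_edges_by_degree_product(edges, k):
    """Return the k edges with the highest degree-product score.

    edges -- list of (u, v) pairs describing an undirected graph
    k     -- number of edges to block (the blocking budget)
    """
    # Count the degree of every node.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Rank edges by descending score; ties are broken by the edge itself
    # so that the result is deterministic.
    ranked = sorted(edges, key=lambda e: (-degree[e[0]] * degree[e[1]], e))
    return ranked[:k]
```

The abstract suggests that scoring edges on centrality alone is the limitation being addressed; additional edge features would replace or extend the sort key here.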

Citations: 8
Selecting and combining complementary feature representations and classifiers for hate speech detection
Q1 Social Sciences Pub Date : 2022-03-01 DOI: 10.1016/j.osnem.2021.100194
Rafael M.O. Cruz, Woshington V. de Sousa, George D.C. Cavalcanti

Hate speech is a major issue in social networks due to the high volume of data generated daily. Recent works demonstrate the usefulness of machine learning (ML) in dealing with the nuances required to distinguish hateful posts from mere sarcasm or offensive language. Many ML solutions for hate speech detection have been proposed by changing either how features are extracted from the text or the classification algorithm employed. However, most works consider only one type of feature extraction and classification algorithm. This work argues that a combination of multiple feature extraction techniques and different classification models is needed. We propose a framework to analyze the relationship between multiple feature extraction and classification techniques to understand how they complement each other. The framework is used to select a subset of complementary techniques to compose a robust multiple classifiers system (MCS) for hate speech detection. The experimental study considering four hate speech classification datasets demonstrates that the proposed framework is a promising methodology for analyzing and designing high-performing MCS for this task. The MCS obtained using the proposed framework significantly outperforms the combination of all models and the homogeneous and heterogeneous selection heuristics, demonstrating the importance of having a proper selection scheme. Source code, figures and dataset splits can be found in the GitHub repository: https://github.com/Menelau/Hate-Speech-MCS.
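
To make the idea of combining complementary classifiers concrete, here is a small sketch of two generic building blocks of a multiple classifiers system: a pairwise disagreement measure and majority voting. Both are assumptions chosen for illustration; the abstract does not state which complementarity measure or combination rule the authors use, and the linked repository should be consulted for the actual framework.

```python
from collections import Counter


def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and equally long")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions -- list of equally long label lists, one per classifier.
    With an even number of classifiers, ties go to the label seen first.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined
```

A higher disagreement between two accurate models is one rough signal that they make different errors, which is the situation in which voting tends to help.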

Citations: 5
Contact duration: Intricacies of human mobility
Q1 Social Sciences Pub Date : 2022-03-01 DOI: 10.1016/j.osnem.2021.100196
Leonardo Tonetto, Malintha Adikari, Nitinder Mohan, Aaron Yi Ding, Jörg Ott

Human mobility shapes our daily lives, our urban environment and even the trajectory of a global pandemic. While various aspects of human mobility and inter-personal contact duration have already been studied separately, little is known about how these two key aspects of our daily lives are fundamentally connected. Better understanding of such interconnected human behaviors is crucial for studying infectious diseases, as well as opportunistic content forwarding. To address these deficiencies, we conducted a study on a mobile social network of human mobility and contact duration, using data from 71 persons based on GPS and Bluetooth logs for 2 months in 2018. We augment these data with location APIs, enabling a finer-grained characterization of the users’ mobility in addition to contact patterns. We model stop durations to reveal how time-unbounded stops (e.g., bars or restaurants) follow a log-normal distribution while time-bounded stops (e.g., offices, hotels) follow a power-law distribution. Furthermore, our analysis reveals that contact duration adheres to a log-normal distribution, which we use to model the duration of contacts as a function of the duration of stays. We further extend our understanding of contact duration during trips by modeling these times as a Weibull distribution whose parameters are a function of trip length. These results could better inform models for information or epidemic spreading, helping guide the future design of network protocols as well as policy decisions.
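
For readers unfamiliar with fitting a log-normal distribution to durations, a minimal sketch follows. It uses the standard maximum-likelihood estimates (the mean and population standard deviation of the log-durations); the function name and input format are assumptions, and the abstract does not say which fitting procedure the authors applied.

```python
import math
import statistics


def fit_lognormal(durations):
    """Maximum-likelihood log-normal parameters (mu, sigma) for positive durations."""
    if not durations or any(d <= 0 for d in durations):
        raise ValueError("durations must be non-empty and strictly positive")
    logs = [math.log(d) for d in durations]
    mu = statistics.fmean(logs)
    # The MLE divides by n, hence the population standard deviation.
    sigma = statistics.pstdev(logs, mu)
    return mu, sigma
```

Under these estimates the fitted median duration is `exp(mu)`, which is often easier to interpret than `mu` itself.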

Citations: 0
Applying diffusion of innovations theory to social networks to understand the stages of adoption in connective action campaigns
Q1 Social Sciences Pub Date : 2022-03-01 DOI: 10.1016/j.osnem.2022.100201
Billy Spann, Esther Mead, Maryam Maleki, Nitin Agarwal, Therese Williams

This research proposes a conceptual framework for determining the adoption trajectory of information diffusion in connective action campaigns. This approach reveals whether an information campaign is accelerating, has reached critical mass, or is decelerating during its life cycle. The experimental approach taken in this study builds on the diffusion of innovations theory, critical mass theory, and previous s-shaped production function research to provide ideas for modeling future connective action campaigns. Most social science research on connective action has taken a qualitative approach. There are limited quantitative studies, but most focus on statistical validation of the qualitative approach, such as surveys, or only focus on one aspect of connective action. In this study, we extend the social science research on connective action theory by applying a mixed-method computational analysis to examine the affordances and features offered through online social networks (OSNs) and then present a new method to quantify the emergence of these action networks. Using the s-curves revealed through plotting the information campaigns' usage, we apply a diffusion of innovations lens to the analysis to categorize users into different stages of adoption of information campaigns. We then categorize the users in each campaign by examining their affordance and interdependence relationships by assigning retweets, mentions, and original tweets to the type of relationship they exhibit. The contribution of this analysis provides a foundation for mathematical characterization of connective action signatures, and further, offers policymakers insights about campaigns as they evolve. To evaluate our framework, we present a comprehensive analysis of COVID-19 Twitter data. Establishing this theoretical framework will help researchers develop predictive models to more accurately model campaign dynamics.
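
One simple way to read acceleration and deceleration off an s-shaped cumulative adoption curve is the sign of its discrete second difference. The sketch below illustrates that idea only; it is an assumption made for illustration, not the quantification method of the paper, and it does not attempt to locate critical mass.

```python
def adoption_phase(cumulative, t):
    """Classify the curve at index t by the sign of the second difference.

    cumulative -- cumulative adopter counts per time step (hypothetical input)
    t          -- interior index, 1 <= t <= len(cumulative) - 2
    """
    if not 1 <= t <= len(cumulative) - 2:
        raise IndexError("t must be an interior index of the series")
    second_diff = cumulative[t + 1] - 2 * cumulative[t] + cumulative[t - 1]
    if second_diff > 0:
        return "accelerating"   # each step adds more adopters than the last
    if second_diff < 0:
        return "decelerating"   # each step adds fewer adopters than the last
    return "inflection"         # growth per step is momentarily constant
```

Real campaign data are noisy, so in practice the series would likely need smoothing before the second difference is meaningful.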

Citations: 2
Stabilizing a supervised bot detection algorithm: How much data is needed for consistent predictions?
Q1 Social Sciences Pub Date : 2022-03-01 DOI: 10.1016/j.osnem.2022.100198
Lynnette Hui Xian Ng, Dawn C. Robertson, Kathleen M. Carley

Social media bots have been characterized by their use in digital activism and information manipulation, owing to their role in information diffusion. The detection of bots has been a major task within the field of social media computation, and many datasets and bot detection algorithms have been developed. With these algorithms, the bot score stability is key in estimating the impact of bots on the diffusion of information. Across several experiments on Twitter agents, we quantify the amount of data required for consistent bot predictions and analyze agent bot classification behavior. Through this study, we developed a methodology to establish parameters for stabilizing the bot probability score through threshold, temporal and volume analysis, eventually quantifying suitable threshold values for bot classification (i.e. whether the agent is a bot or not) and reasonable data collection size (i.e. number of days of tweets or number of tweets) for stable scores and bot classification.
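
The notion of a stable bot classification can be sketched as follows: given bot probability scores recomputed on growing amounts of data for one agent, find the first point from which the thresholded label no longer flips. The function name, the input format, and the default threshold of 0.5 are all assumptions for illustration; the abstract does not report the threshold or data sizes the authors recommend.

```python
def stable_after(scores, threshold=0.5):
    """Index of the first window from which the bot label never changes.

    scores    -- bot probability scores for one agent, ordered by the amount
                 of data used (hypothetical input)
    threshold -- score at or above which the agent is labeled a bot (assumed)
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    labels = [s >= threshold for s in scores]
    idx = len(labels) - 1
    # Walk backwards while the label matches the final label.
    while idx > 0 and labels[idx - 1] == labels[-1]:
        idx -= 1
    return idx
```

If each score corresponds to a known number of tweets or days, the returned index translates directly into a minimum collection size for that agent.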

Citations: 17
Erratum to Online Social Networks and information diffusion: The role of ego networks: Online Social Networks and Media, Volume 1 (June 2017), Pages 44-55
Q1 Social Sciences Pub Date : 2022-01-01 DOI: 10.1016/j.osnem.2021.100184
Valerio Arnaboldi, Marco Conti, Andrea Passarella, Robin I.M. Dunbar
Citations: 0
Understanding and identifying the use of emotes in toxic chat on Twitch
Q1 Social Sciences Pub Date : 2022-01-01 DOI: 10.1016/j.osnem.2021.100180
Jaeheon Kim, Donghee Yvette Wohn, Meeyoung Cha

The latest advances in NLP (natural language processing) have led to the launch of the much needed machine-driven toxic chat detection. Nevertheless, people continuously find new forms of hateful expressions that are easily identified by humans, but not by machines. One such common expression is the mix of text and emotes, a type of visual toxic chat that is increasingly used to evade algorithmic moderation and a trend that is an under-studied aspect of the problem of online toxicity. This research analyzes chat conversations from the popular streaming platform Twitch to understand the varied types of visual toxic chat. Emotes were sometimes used to replace a letter, seek attention, or for emotional expression. We created a labeled dataset that contains 29,721 cases of emotes replacing letters. Based on the dataset, we built a neural network classifier and identified visual toxic chat that would otherwise be undetected through traditional methods and caught an additional 1.3% examples of toxic chat out of 15 million chat utterances.

Citations: 7
SWSR: A Chinese dataset and lexicon for online sexism detection
Q1 Social Sciences Pub Date : 2022-01-01 DOI: 10.1016/j.osnem.2021.100182
Aiqi Jiang, Xiaohan Yang, Yang Liu, Arkaitz Zubiaga

Online sexism has become an increasing concern in social media platforms as it has affected the healthy development of the Internet and can have negative effects in society. While research in the sexism detection domain is growing, most of this research focuses on English as the language and on Twitter as the platform. Our objective here is to broaden the scope of this research by considering the Chinese language on Sina Weibo. We propose the first Chinese sexism dataset, the Sina Weibo Sexism Review (SWSR) dataset, as well as a large Chinese lexicon, SexHateLex, made of abusive and gender-related terms. We introduce our data collection and annotation process, and provide an exploratory analysis of the dataset characteristics to validate its quality and to show how sexism is manifested in Chinese. The SWSR dataset provides labels at different levels of granularity including (i) sexism or non-sexism, (ii) sexism category and (iii) target type, which can be exploited, among others, for building computational methods to identify and investigate finer-grained gender-related abusive language. We conduct experiments for the three sexism classification tasks making use of state-of-the-art machine learning models. Our results show competitive performance, providing a benchmark for sexism detection in the Chinese language, as well as an error analysis highlighting open challenges needing more research in Chinese NLP. The SWSR dataset and SexHateLex lexicon are publicly available.

Citations: 31
Understanding the characteristics of COVID-19 misinformation communities through graphlet analysis
Q1 Social Sciences Pub Date: 2022-01-01 DOI: 10.1016/j.osnem.2021.100178
James R. Ashford, Liam D. Turner, Roger M. Whitaker, Alun Preece, Diane Felmlee

Online social networks serve as a convenient way to connect, share, and promote content with others. As a result, these networks can be used with malicious intent, causing disruption and harm to public debate through the sharing of misinformation. However, automatically identifying such content through its use of natural language is a significant challenge. By comparison, our solution uses fewer computational resources, is language-agnostic, and does not need complex semantic analysis. Consequently, alternative and complementary approaches are highly valuable. In this paper, we assess content that has the potential for misinformation and focus on patterns of user association with online social media communities (subreddits) on the popular Reddit social media platform, and generate networks of behaviour capturing user interaction with different subreddits. We examine these networks using both global and local metrics, in particular noting the presence of induced substructures (graphlets), assessing 7,876,064 posts from 96,634 users. From subreddits identified as having potential for misinformation, we note that the associated networks have strongly defined local features relating to node degree; these are evident both from analysis of dominant graphlets and from degree-related global metrics. We find that these local features support high-accuracy classification of subreddits that are categorised as having the potential for misinformation. Consequently, we observe that induced local substructures of high degree are fundamental metrics for subreddit classification, and support automatic detection capabilities for online misinformation independent of any particular language.
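
To make the idea of induced substructures (graphlets) concrete, here is a minimal Python sketch that counts the two connected three-node graphlets, the open wedge and the closed triangle, in a small undirected graph. It is only an illustration and not the authors' pipeline: the abstract does not say which graphlet sizes were used, so the restriction to three nodes, the function name `count_three_node_graphlets`, and the toy edge list are all assumptions.

```python
from itertools import combinations


def count_three_node_graphlets(edges):
    """Count induced 3-node graphlets (wedges and triangles).

    `edges` is an iterable of (u, v) pairs of an undirected simple graph.
    This is an illustrative helper, not code from the paper.
    """
    # Build an adjacency map, ignoring self-loops.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    wedges = 0
    triangle_corners = 0
    for centre, neighbours in adj.items():
        # Every pair of neighbours of `centre` forms a 3-node subgraph.
        for a, b in combinations(neighbours, 2):
            if b in adj[a]:
                # Closed: a triangle, seen once from each of its 3 corners.
                triangle_corners += 1
            else:
                # Open: an induced path a - centre - b.
                wedges += 1

    return {"wedge": wedges, "triangle": triangle_corners // 3}


# Hypothetical toy network: nodes stand in for subreddits.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(count_three_node_graphlets(toy_edges))  # {'wedge': 2, 'triangle': 1}
```

In the toy graph, node `c` has the highest degree and is the centre of both wedges, which hints at why graphlet counts and node degree are closely related local features.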

Citations: 6