Pub Date: 2022-07-01; DOI: 10.1016/j.osnem.2022.100220
Lucio La Cava, Sergio Greco, Andrea Tagarelli
Decentralized Online Social Networks (DOSNs) represent a growing trend in the social media landscape, as opposed to the well-known centralized peers, which are often in the spotlight due to privacy concerns and a vision typically focused on monetization through user relationships. By exploiting open-source software, DOSNs allow users to create their own servers, or instances, thus favoring the proliferation of platforms that are independent yet interconnected with each other in a transparent way. Nonetheless, the resulting cooperation model, commonly known as the Fediverse, still represents a world to be fully discovered, since existing studies have mainly focused on a limited number of structural aspects of interest in DOSNs.
In this work, we aim to address the lack of studies on user relations and roles in DOSNs by taking two main actions: understanding the impact of decentralization on how users relate to each other within their membership instance and/or across different instances, and unveiling user roles that can explain two interrelated axes of social behavioral phenomena, namely information consumption and boundary spanning. To this end, we build our analysis on user networks from Mastodon, since it is the most widely used DOSN platform. We believe that the findings drawn from our study of Mastodon users’ roles and information flow can pave the way for further research on DOSNs.
Title: Information consumption and boundary spanning in Decentralized Online Social Networks: The case of Mastodon users. Journal: Online Social Networks and Media, vol. 30, Article 100220.
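The within-instance versus cross-instance distinction described in the abstract above can be illustrated with a small sketch. It is not the authors' method: the function name `cross_instance_ratio`, the toy follow edges, and the placeholder instance names are all assumptions made for illustration. It computes, for each user, the share of outgoing follows that point to a user on a different instance, which is one simple way to quantify boundary spanning.

```python
from collections import defaultdict


def cross_instance_ratio(edges, instance_of):
    """Return, per user, the share of outgoing follows that cross instances.

    edges: iterable of (follower, followee) pairs (hypothetical toy data).
    instance_of: dict mapping each user to the instance hosting the account.
    """
    total = defaultdict(int)
    cross = defaultdict(int)
    for follower, followee in edges:
        total[follower] += 1
        if instance_of[follower] != instance_of[followee]:
            cross[follower] += 1
    return {user: cross[user] / total[user] for user in total}


# Toy data: instance names are placeholders, not real servers.
instance_of = {"alice": "a.example", "bob": "a.example", "carol": "b.example"}
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "alice"), ("carol", "alice")]
ratios = cross_instance_ratio(edges, instance_of)
```

In this toy network, a ratio near 1 marks a user whose follows mostly leave their home instance, while a ratio of 0 marks a user who only follows local accounts.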
Pub Date: 2022-07-01; DOI: 10.1016/j.osnem.2022.100207
Mustafa Toprak, Chiara Boldrini, Andrea Passarella, Marco Conti
Ego networks have proved to be a valuable tool for understanding the relationships that individuals establish with their peers, both in offline and online social networks. Particularly interesting are the cognitive constraints associated with the interactions between the ego and the members of their ego network, which limit individuals to maintaining meaningful interactions with no more than 150 people, on average, and to arranging such relationships along concentric circles of decreasing engagement. In this work, we focus on the ego networks of journalists on Twitter, considering 17 different countries, and we investigate whether they feature the same characteristics observed for other relevant classes of Twitter users, like politicians and generic users. Our findings are that journalists are generally more active and interact with more people than generic users, regardless of their country. Their ego network structure is closely aligned with reference models derived in anthropology and observed in general human ego networks. Remarkably, the similarity is even higher than that of politicians’ and generic users’ ego networks. This may imply a greater cognitive involvement with Twitter for journalists than for other user categories. From a dynamic perspective, journalists have stable short-term relationships that do not change much over time. In the longer term, though, ego networks can be quite dynamic, especially in the innermost circles. Moreover, the ego-alter ties of journalists are often information-driven, as they are mediated by hashtags both at their inception and during their lifetime. Finally, we found that relationships between journalists are assortative in popularity: journalists tend to engage with other journalists of similar popularity, in all layers but especially in their innermost ones. Instead, when journalists interact with generic users, this assortativity is only present in the innermost layers.
Title: Journalists’ ego networks in Twitter: Invariant and distinctive structural features. Journal: Online Social Networks and Media, vol. 30, Article 100207.
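The idea of arranging alters along concentric circles of decreasing engagement can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: the abstract only states the limit of about 150 alters, so the cumulative circle sizes (5, 15, 50, 150), the rank-based assignment, and the name `assign_circles` are assumptions. Studies of this kind typically cluster contact frequencies instead of cutting at fixed ranks.

```python
def assign_circles(contact_freq, circle_sizes=(5, 15, 50, 150)):
    """Assign each alter to a circle (1 = innermost) by contact-frequency rank.

    contact_freq: dict mapping alter -> interaction frequency with the ego.
    circle_sizes: assumed cumulative circle sizes; alters ranked beyond the
    last size are treated as outside the active network and left unassigned.
    """
    ranked = sorted(contact_freq, key=contact_freq.get, reverse=True)
    circles = {}
    for rank, alter in enumerate(ranked):
        for layer, size in enumerate(circle_sizes, start=1):
            if rank < size:
                circles[alter] = layer
                break
    return circles


# Toy ego network: 20 alters with strictly decreasing contact frequency.
contact_freq = {f"alter{i}": 100 - i for i in range(20)}
circles = assign_circles(contact_freq)
```

With these assumed sizes, the five most frequently contacted alters land in circle 1, the next ten in circle 2, and the rest in circle 3.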
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100199
Pantelis Agathangelou, Ioannis Katakis
Sentiment analysis is a fast-accelerating discipline that develops algorithms for knowledge discovery from opinionated content. The challenges, however, are plentiful when it comes to analyzing user reviews. Poor-quality, informal language and a lack of labels are only a few of the obstacles. Most importantly, users, consciously or subconsciously, use different approaches for expressing their opinion about a product or a service. Some of them go sentence by sentence, mentioning some positive and negative aspects, whereas others provide a mixed piece of text where the reader is supposed to see the big picture to understand the message. In this work, we propose a novel neural network that deals with both situations. Our method, by combining convolutional, recurrent and attention neural networks, can extract rich linguistic patterns that reveal the user’s sentiment towards the entity under review. We evaluate our method on nine datasets that represent both binary and multi-class classification tasks. Experimental evaluation indicates that our method outperforms well-established deep learning approaches, beating the competing methods in 8 out of 9 cases.
Title: Balancing between holistic and cumulative sentiment classification. Journal: Online Social Networks and Media, vol. 29, Article 100199.
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100203
Nicolás E. Díaz Ferreyra, Tobias Hecking, Esma Aïmeur, Maritta Heisel, H. Ulrich Hoppe
Access-Control Lists (ACLs) (a.k.a. “friend lists”) are one of the most important privacy features of Online Social Networks (OSNs) as they allow users to restrict the audience of their publications. Nevertheless, creating and maintaining custom ACLs can introduce a high cognitive burden on average OSN users, since it normally requires assessing the trustworthiness of a large number of contacts. In principle, community detection algorithms can be leveraged to support the generation of ACLs by mapping a set of examples (i.e. contacts labelled as “untrusted”) to the emerging communities inside the user’s ego-network. However, unlike users’ access-control preferences, traditional community-detection algorithms do not take the homophily characteristics of such communities into account (i.e. attributes shared among members). Consequently, this strategy may lead to inaccurate ACL configurations and privacy breaches under certain homophily scenarios. This work investigates the use of community-detection algorithms for the automatic generation of ACLs in OSNs. In particular, it analyses the performance of the aforementioned approach under different homophily conditions through a simulation model. Furthermore, since private information may reach untrusted recipients through the re-sharing affordances of OSNs, information diffusion processes are also modelled and taken explicitly into account. Finally, the removal of gatekeeper nodes is explored as a strategy to counteract unwanted data dissemination.
Title: Community detection for access-control decisions: Analysing the role of homophily and information diffusion in Online Social Networks. Journal: Online Social Networks and Media, vol. 29, Article 100203. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2468696422000076/pdfft?md5=75c4fc7d96a2eb8f7b982d6070762c80&pid=1-s2.0-S2468696422000076-main.pdf
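The mapping of "untrusted" examples onto communities, as described in the abstract above, can be sketched in a few lines. This is only an illustration under strong assumptions: the communities are taken as already detected, the function name `build_acl` is hypothetical, and the rule "exclude every community that contains at least one untrusted example" is a simplification rather than the paper's simulation model.

```python
def build_acl(communities, untrusted_examples):
    """Return the set of contacts allowed to see a post.

    communities: list of sets of contacts (assumed output of some
    community-detection step on the user's ego-network).
    untrusted_examples: set of contacts the user labelled as untrusted.
    A whole community is excluded as soon as it contains one such example.
    """
    allowed = set()
    for members in communities:
        if not (members & untrusted_examples):
            allowed |= members
    return allowed


# Toy ego-network with two communities and one labelled example.
communities = [{"ann", "ben"}, {"cat", "dan", "eve"}]
untrusted_examples = {"dan"}
acl = build_acl(communities, untrusted_examples)
```

The toy case also shows the risk the abstract points to: "cat" and "eve" are excluded only because they share a structural community with "dan", whether or not they share the attributes that made "dan" untrusted.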
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100200
Alexandre Magno Sousa, Jussara M. Almeida, Flavio Figueiredo
A number of recent studies have explicitly introduced curiosity models into the analysis of online information consumption, most notably in the design of recommendation systems. However, most prior efforts have neglected the role of social influence as a component of the curiosity stimulation process, which has been referred to as social curiosity. In this paper, we propose a number of metrics to quantify social curiosity and apply them to WhatsApp, a widely used communication platform. We show that our metrics capture aspects that are complementary to other variables previously related to curiosity stimulation, and we use them to offer a broad characterization of user curiosity as a driving force behind communication in WhatsApp.
Title: Metrics of social curiosity: The WhatsApp case. Journal: Online Social Networks and Media, vol. 29, Article 100200.
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100204
Lucas E.B. Skora, Helen C.M. Senefonte, Myriam Regattieri Delgado, Ricardo Lüders, Thiago H. Silva
A better understanding of the behavior of tourists is strategic for improving services in the competitive and important economic segment of global tourism. Key studies in the literature often explore the issue using traditional data, such as questionnaires or interviews. Traditional approaches provide valuable information; however, they make it challenging to obtain large-scale data, which makes it hard to study worldwide patterns. Location-based social networks (LBSNs) can potentially mitigate such issues due to the relatively low cost of acquiring large amounts of behavioral data. Nevertheless, before using such data for studying tourists’ behavior, it is necessary to verify whether the information adequately reveals the behavior measured with traditional data, which is considered the ground truth. Thus, the present work investigates in which countries the global tourism network measured with an LBSN adequately reflects the behavior estimated by the World Tourism Organization using traditional methods. Although we could find exceptions, the results suggest that, for most countries, LBSN data can satisfactorily represent the behavior studied. This indicates that, in countries with high correlations between the results obtained from the two datasets, LBSN data can be used in research on tourist mobility in the studied context.
Title: Comparing global tourism flows measured by official census and social sensing. Journal: Online Social Networks and Media, vol. 29, Article 100204.
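The comparison between the two data sources rests on correlating flows measured in each. A minimal sketch of such a check follows; the abstract does not say which correlation measure is used, so the Pearson coefficient here is an assumption, and the arrival counts are invented toy numbers, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented arrivals into one destination country from four origin countries:
# official estimates versus visitors observed in an LBSN.
official_arrivals = [100, 200, 300, 400]
lbsn_arrivals = [12, 19, 33, 41]
r = pearson(official_arrivals, lbsn_arrivals)
```

A coefficient close to 1 for a given country would suggest that the LBSN sample tracks the official flows for that country, even though the absolute counts differ by an order of magnitude.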
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100205
Zhixue Zhao, Ziqi Zhang, Frank Hopfgartner
Toxic comment classification models are often found to be biased towards identity terms, i.e., terms characterizing a specific group of people such as “Muslim” and “black”. Such bias is commonly reflected in false positive predictions, i.e., non-toxic comments with identity terms that are classified as toxic. In this work, we propose a novel approach to debias the model in toxic comment classification, leveraging the notion of subjectivity level of a comment and the presence of identity terms. We hypothesize that toxic comments containing identity terms are more likely to be expressions of subjective feelings or opinions. Therefore, the subjectivity level of a comment containing identity terms can be helpful for classifying toxic comments and mitigating the identity term bias. To implement this idea, we propose a model based on BERT and study two different methods of measuring the subjectivity level. The first method uses a lexicon-based tool. The second method calculates the embedding similarity between a comment and a relevant Wikipedia text for the identity term in the comment. We thoroughly evaluate our method on a collection of four datasets gathered from different social media platforms. Our results show that: (1) our models that incorporate both features of subjectivity and identity terms consistently outperform strong SOTA baselines, with our best performing model achieving an improvement in F1 of 4.75% on a Twitter dataset; (2) our idea of measuring subjectivity based on the similarity to the relevant Wikipedia text is very effective for toxic comment classification, as our model using this has achieved the best performance on 3 out of 4 datasets while obtaining comparable performance on the remaining dataset. We further test our method on RoBERTa to evaluate its generality, and the results show an improvement in F1 of up to 1.29% (on a dataset from a white supremacist online forum).
Title: Utilizing subjectivity level to mitigate identity term bias in toxic comments classification. Journal: Online Social Networks and Media, vol. 29, Article 100205.
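The second subjectivity measure relies on the similarity between a comment and a reference text. The sketch below shows only the similarity step, under loud assumptions: bag-of-words count vectors stand in for the BERT embeddings used in the paper, the texts are invented placeholders, and the abstract does not specify how the similarity score is turned into a subjectivity level, so no such mapping is attempted here.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for an embedding: token counts of the lowercased text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder texts: a comment and an encyclopedic description of the
# (hypothetical) identity term it mentions.
comment = "i really dislike everything about this group"
reference_text = "this group is a community of people"
score = cosine_similarity(bag_of_words(comment), bag_of_words(reference_text))
```

In a real pipeline the two vectors would come from a sentence encoder, and the resulting score would be fed to the classifier as an additional feature alongside the identity-term indicator.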
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100209
Usman Anjum, Vladimir Zadorozhny, Prashant Krishnamurthy
Event localization is the task of finding the location of an event. Commonly, event localization using microblogging services, like Twitter, uses the contents of the messages and the geographical information associated with the messages. In this paper, we propose a novel approach called SPARE (SPAtial REconstruction) that bypasses the need for geographical or semantic information to localize tweets. We assume there are reference coordinates at known locations that scrape the microblog (tweet) counts in time and space (circular regions around the reference coordinate). The tweet counts are aggregated and then disaggregated to identify event patterns. A change in tweet counts is indicative of an event pattern. We show, using real data, that the change in tweet counts is manifested as peaks. The peaks from multiple reference coordinates can be used as input to trilateration techniques to pinpoint the location of an event. We introduce metrics to assess the quality of disaggregation of fine-grained data and examine techniques like filtering to improve the accuracy of event location. The experimental results show that our method can identify the location of an event with high accuracy.
Title: Localization of Unidentified Events with Raw Microblogging Data. Journal: Online Social Networks and Media, vol. 29, Article 100209.
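The trilateration step mentioned in the abstract above can be sketched in two dimensions. This is a generic textbook formulation, not the SPARE implementation: it assumes that each of three reference coordinates yields an estimated distance to the event (for instance the radius at which its peak appears, which is an assumption here), works in a flat plane rather than on geographic coordinates, and uses the hypothetical function name `trilaterate`. Subtracting the first circle equation from the other two gives a 2x2 linear system, solved below with Cramer's rule.

```python
import math


def trilaterate(p1, r1, p2, r2, p3, r3):
    """Locate a point from three reference points and distances to them.

    p1, p2, p3: (x, y) reference coordinates in a flat plane.
    r1, r2, r3: estimated distances from each reference point to the event.
    """
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    # Circle 1 minus circle 2, and circle 1 minus circle 3, are linear in (x, y).
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Toy setup: the event sits at (3, 4); distances are computed exactly.
event_x, event_y = trilaterate(
    (0, 0), 5.0,
    (10, 0), math.sqrt(65),
    (0, 10), math.sqrt(45),
)
```

With noisy distance estimates the three circles no longer meet in a single point, so a practical variant would use more reference coordinates and a least-squares fit.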
Pub Date: 2022-05-01; DOI: 10.1016/j.osnem.2022.100206
Ahmad Zareie, Rizos Sakellariou
The spread of rumours in social networks has become a significant challenge in recent years. Blocking so-called critical edges, that is, edges that have a significant role in the spreading process, has attracted considerable attention as a means to minimize the spread of rumours. Although detecting the sources of a rumour may help identify critical edges, this has an overhead that source-ignorant approaches try to eliminate. Several source-ignorant edge blocking methods have been proposed, which mostly determine critical edges on the basis of centrality. Taking into account additional features of edges (beyond centrality) may help determine more accurately which edges to block. In this paper, a new source-ignorant method is proposed to identify a set of critical edges by considering, for each edge, the impact of blocking and the influence of the nodes connected to the edge. Experimental results demonstrate that the proposed method can identify critical edges more accurately than other source-ignorant methods.
Rumour spread minimization in social networks: A source-ignorant approach. Ahmad Zareie, Rizos Sakellariou. Online Social Networks and Media, vol. 29, Article 100206 (2022-05-01). DOI: 10.1016/j.osnem.2022.100206
Open access PDF: https://www.sciencedirect.com/science/article/pii/S2468696422000106/pdfft?md5=5c46e8ade686686c561918b3c01408b9&pid=1-s2.0-S2468696422000106-main.pdf
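To make the centrality-based baseline mentioned in this abstract concrete, the sketch below ranks edges without any knowledge of the rumour source and blocks the top-scoring ones. It is not the authors' method: the degree-product score is only a simple stand-in for an edge centrality measure, and the function names and toy graph are hypothetical.

```python
from collections import defaultdict

def rank_edges_by_degree_product(edges):
    """Order undirected edges by the product of their endpoint degrees."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    scored = [(degree[u] * degree[v], (u, v)) for u, v in edges]
    # Highest score first; ties are broken by edge label for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [edge for _, edge in scored]

def block_top_k(edges, k):
    """Return (remaining edges, blocked edges) after removing the k top-ranked edges."""
    blocked = set(rank_edges_by_degree_product(edges)[:k])
    remaining = [edge for edge in edges if edge not in blocked]
    return remaining, sorted(blocked)

# Toy graph: two triangles joined by the bridge (c, d).
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"),
             ("c", "d"),
             ("d", "e"), ("d", "f"), ("e", "f")]
remaining, blocked = block_top_k(toy_edges, 1)
```

On this toy graph the bridge between the two triangles receives the highest score, so blocking it separates the two groups. A method in the spirit of the abstract would replace the score with one that also reflects the impact of blocking and the influence of the endpoint nodes.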
Pub Date : 2022-03-01 DOI: 10.1016/j.osnem.2021.100194
Rafael M.O. Cruz, Woshington V. de Sousa, George D.C. Cavalcanti
Hate speech is a major issue in social networks due to the high volume of data generated daily. Recent works demonstrate the usefulness of machine learning (ML) in dealing with the nuances required to distinguish hateful posts from mere sarcasm or offensive language. Many ML solutions for hate speech detection have been proposed, changing either how features are extracted from the text or the classification algorithm employed. However, most works consider only one type of feature extraction and classification algorithm. This work argues that a combination of multiple feature extraction techniques and different classification models is needed. We propose a framework to analyze the relationship between multiple feature extraction and classification techniques to understand how they complement each other. The framework is used to select a subset of complementary techniques to compose a robust multiple classifiers system (MCS) for hate speech detection. The experimental study, which considers four hate speech classification datasets, demonstrates that the proposed framework is a promising methodology for analyzing and designing high-performing MCS for this task. The MCS obtained using the proposed framework significantly outperforms the combination of all models and the homogeneous and heterogeneous selection heuristics, demonstrating the importance of having a proper selection scheme. Source code, figures and dataset splits can be found in the GitHub repository: https://github.com/Menelau/Hate-Speech-MCS.
Selecting and combining complementary feature representations and classifiers for hate speech detection. Rafael M.O. Cruz, Woshington V. de Sousa, George D.C. Cavalcanti. Online Social Networks and Media, vol. 28, Article 100194 (2022-03-01). DOI: https://doi.org/10.1016/j.osnem.2021.100194
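The idea of combining complementary models can be illustrated with a small sketch: majority voting over the outputs of several (feature extractor, classifier) pairs, plus a pairwise disagreement rate as one simple indicator of complementarity. This is not the framework or the selection scheme from the paper. The model names and predictions below are invented for illustration, and the authors' actual code is in the repository linked in the abstract.

```python
from collections import Counter
from itertools import combinations

def majority_vote(predictions):
    """Combine per-model label lists into one majority label per sample."""
    combined = []
    for sample_votes in zip(*predictions.values()):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined

def disagreement(preds_a, preds_b):
    """Fraction of samples on which two models predict different labels."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

# Hypothetical outputs of three (feature extractor, classifier) pairs
# on five posts; 1 = hate speech, 0 = not hate speech.
predictions = {
    "tfidf+svm":      [1, 0, 1, 0, 0],
    "embeddings+mlp": [1, 1, 0, 0, 0],
    "char-ngrams+lr": [0, 1, 1, 0, 1],
}

labels = majority_vote(predictions)
pairwise = {
    (a, b): disagreement(predictions[a], predictions[b])
    for a, b in combinations(predictions, 2)
}
```

Models that disagree often while each remaining reasonably accurate are the ones most likely to correct each other's errors under voting, which is why a diversity measure of this kind is a common starting point when selecting members of an ensemble.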