
Online Social Networks and Media: Latest Articles

Hidden behind the obvious: Misleading keywords and implicitly abusive language on social media
Q1 Social Sciences Pub Date : 2022-07-01 DOI: 10.1016/j.osnem.2022.100210
Wenjie Yin, Arkaitz Zubiaga

While social media offers freedom of self-expression, abusive language carries significant negative social impact. Driven by the importance of the issue, research on the automated detection of abusive language has witnessed growth and improvement. However, these detection models display a reliance on strongly indicative keywords, such as slurs and profanity. This means that they can wrongly (1a) miss abuse without such keywords or (1b) flag non-abuse with such keywords, and that (2) they perform poorly on unseen data. Despite the recognition of these problems, gaps and inconsistencies remain in the literature. In this study, we analyse the impact of keywords from dataset construction to model behaviour in detail, with a focus on how models make mistakes on (1a) and (1b), and how (1a) and (1b) interact with (2). Through the analysis, we provide suggestions for future research to address all three problems.
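The failure modes (1a) and (1b) can be illustrated with a deliberately naive keyword matcher. This is a minimal sketch and not one of the models studied in the paper: the keyword list, the example posts and their labels, and the function name `keyword_flag` are all hypothetical.

```python
import re

# Hypothetical keyword list standing in for "strongly indicative keywords".
KEYWORDS = {"damn", "idiot"}

def keyword_flag(text: str) -> bool:
    """Flag a post as abusive if it contains any listed keyword."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(tok in KEYWORDS for tok in tokens)

# (text, is_abusive) pairs; toy labels chosen for illustration only.
posts = [
    ("You are an idiot", True),                                # explicit abuse
    ("People like you should not be allowed to speak", True),  # implicit abuse, no keyword
    ("This damn bug kept me up all night", False),             # keyword, not abusive
    ("Lovely weather today", False),                           # neither
]

# Failure (1a): abusive posts the matcher misses because no keyword is present.
missed = [text for text, label in posts if label and not keyword_flag(text)]
# Failure (1b): non-abusive posts the matcher flags because a keyword is present.
false_flags = [text for text, label in posts if not label and keyword_flag(text)]
```

Here `missed` holds the implicitly abusive post and `false_flags` holds the harmless post that happens to contain profanity, which is the pattern the abstract attributes to keyword-reliant models.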

Citations: 0
A social network of crime: A review of the use of social networks for crime and the detection of crime
Q1 Social Sciences Pub Date : 2022-07-01 DOI: 10.1016/j.osnem.2022.100211
Brett Drury, Samuel Morais Drury, Md Arafatur Rahman, Ihsan Ullah

Social media is used to commit and detect crimes. With automated methods, it is possible to scale both crime and detection of crime to a large number of people. The ability of criminals to reach large numbers of people has made this area subject to frequent study, and consequently, there have been several surveys that have reviewed specific crimes committed on social platforms. Until now, there has not been a review article that considers all types of crimes on social media, their similarity as well as their detection. The demonstration of similarity between crimes and their detection methods allows for the transfer of techniques and data between domains. This survey, therefore, seeks to document the crimes that have been committed on social media, and demonstrate their similarity through a taxonomy of crimes. Also, this survey documents publicly available datasets. Finally, this survey provides suggestions for further research in this field.

Citations: 4
A Survey on Political Viewpoints Identification
Q1 Social Sciences Pub Date : 2022-07-01 DOI: 10.1016/j.osnem.2022.100208
Tu My Doan, Jon Atle Gulla

Political viewpoints identification (PVI) is a task in Natural Language Processing that takes political texts and recognizes the writer’s opinions towards a political matter. PVI reduces the ambiguity in texts by identifying the underlying meaning and clarifying the bias margin along the political spectrum (bias leaning). Thus, even non-experts can better understand political texts. For instance, they can identify misinformation, bias, and hidden political agendas. In this paper, we formally define the concept of political viewpoints identification, explain its importance and discuss to what extent current techniques can be used for extracting political views from text. Existing techniques address the problem of PVI inadequately. We outline their deficiencies and present a research agenda to advance PVI.

Citations: 4
Information consumption and boundary spanning in Decentralized Online Social Networks: The case of Mastodon users
Q1 Social Sciences Pub Date : 2022-07-01 DOI: 10.1016/j.osnem.2022.100220
Lucio La Cava, Sergio Greco, Andrea Tagarelli

Decentralized Online Social Networks (DOSNs) represent a growing trend in the social media landscape, as opposed to the well-known centralized peers, which are often in the spotlight due to privacy concerns and a vision typically focused on monetization through user relationships. By exploiting open-source software, DOSNs allow users to create their own servers, or instances, thus favoring the proliferation of platforms that are independent yet interconnected with each other in a transparent way. Nonetheless, the resulting cooperation model, commonly known as the Fediverse, still represents a world to be fully discovered, since existing studies have mainly focused on a limited number of structural aspects of interest in DOSNs.

In this work, we aim to address the lack of studies on user relations and roles in DOSNs by taking two main actions: understanding the impact of decentralization on how users relate to each other within their membership instance and/or across different instances, and unveiling user roles that can explain two interrelated axes of social behavioral phenomena, namely information consumption and boundary spanning. To this end, we build our analysis on user networks from Mastodon, since it represents the most widely used DOSN platform. We believe that the findings drawn from our study on Mastodon users’ roles and information flow can pave the way for further development of fascinating research on DOSNs.
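One simple way to look at relations within and across instances is to measure, for each user, the share of followees hosted on a different instance. The sketch below is an assumption made for illustration, not the role definition used in the paper: the `user@instance` handles, the follow edges, and the function names are hypothetical.

```python
from collections import defaultdict

def instance_of(handle: str) -> str:
    """Return the instance part of a 'user@instance' handle."""
    return handle.rsplit("@", 1)[1]

def cross_instance_ratio(follows):
    """Map each follower to the fraction of its followees hosted on another instance."""
    total = defaultdict(int)
    cross = defaultdict(int)
    for follower, followee in follows:
        total[follower] += 1
        if instance_of(follower) != instance_of(followee):
            cross[follower] += 1
    return {user: cross[user] / total[user] for user in total}

# Hypothetical (follower, followee) edges across two made-up instances.
follows = [
    ("ana@a.example", "bo@a.example"),
    ("ana@a.example", "cy@b.example"),
    ("bo@a.example", "ana@a.example"),
    ("cy@b.example", "ana@a.example"),
]
ratios = cross_instance_ratio(follows)
```

A ratio near 1 suggests a user who mostly reaches across instance boundaries, while a ratio near 0 suggests a user who stays within the home instance.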

Citations: 9
Journalists’ ego networks in Twitter: Invariant and distinctive structural features
Q1 Social Sciences Pub Date : 2022-07-01 DOI: 10.1016/j.osnem.2022.100207
Mustafa Toprak, Chiara Boldrini, Andrea Passarella, Marco Conti

Ego networks have proved to be a valuable tool for understanding the relationships that individuals establish with their peers, both in offline and online social networks. Particularly interesting are the cognitive constraints associated with the interactions between the ego and the members of their ego network, which limit individuals to maintaining meaningful interactions with no more than 150 people, on average, and lead them to arrange such relationships along concentric circles of decreasing engagement. In this work, we focus on the ego networks of journalists on Twitter, considering 17 different countries, and we investigate whether they feature the same characteristics observed for other relevant classes of Twitter users, like politicians and generic users. Our findings are that journalists are generally more active and interact with more people than generic users, regardless of their country. Their ego network structure is closely aligned with reference models derived in anthropology and observed in general human ego networks. Remarkably, the similarity is even higher than that of politicians' and generic users' ego networks. This may imply a greater cognitive involvement with Twitter for journalists than for other user categories. From a dynamic perspective, journalists have stable short-term relationships that do not change much over time. In the longer term, though, ego networks can be quite dynamic, especially in the innermost circles. Moreover, the ego-alter ties of journalists are often information-driven, as they are mediated by hashtags both at their inception and during their lifetime. Finally, we found that relationships between journalists are assortative in popularity: journalists tend to engage with other journalists of similar popularity, in all layers but especially in their innermost ones. Instead, when journalists interact with generic users, this assortativity is only present in the innermost layers.
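The idea of concentric circles of decreasing engagement can be sketched by ranking an ego's alters by interaction count and cutting the ranking at fixed sizes. The cumulative sizes 5, 15, 50 and 150 used below are the nominal values commonly quoted for this reference model and are an assumption here; the abstract only mentions the limit of about 150, and the circles in a real analysis would more likely be inferred from the data than fixed in advance. The interaction counts and the function name are hypothetical.

```python
# Nominal cumulative circle sizes often quoted for the reference model (assumed here).
CIRCLE_SIZES = (5, 15, 50, 150)

def assign_circles(interactions, sizes=CIRCLE_SIZES):
    """Rank alters by interaction count and return {alter: circle index}.

    Circle 0 is the innermost; alters ranked beyond the last size are omitted.
    """
    ranked = sorted(interactions, key=interactions.get, reverse=True)
    circles = {}
    for rank, alter in enumerate(ranked):
        for idx, size in enumerate(sizes):
            if rank < size:
                circles[alter] = idx
                break
    return circles

# Hypothetical interaction counts between one ego and 20 alters.
interactions = {f"alter{i}": 100 - i for i in range(20)}
circles = assign_circles(interactions)
```

With these toy counts, the five most contacted alters fall in circle 0, the next ten in circle 1, and the remaining five in circle 2.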

Citations: 0
Balancing between holistic and cumulative sentiment classification
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100199
Pantelis Agathangelou, Ioannis Katakis

Sentiment analysis is a fast-accelerating discipline that develops algorithms for knowledge discovery from opinionated content. However, analyzing user reviews poses many challenges. Poor-quality, informal language and a lack of labels are only a few of the obstacles. Most importantly, users, consciously or subconsciously, use different approaches for expressing their opinion about a product or a service. Some of them go sentence by sentence, mentioning some positive and negative aspects, whereas others provide a mixed piece of text where the reader is supposed to see the big picture to understand the message. In this work, we propose a novel neural network that deals with both situations. By combining convolutional, recurrent and attention neural networks, our method can extract rich linguistic patterns that reveal the user’s sentiment towards the entity under review. We evaluate our method on nine datasets that represent both binary and multi-class classification tasks. Experimental evaluation indicates that our method outperforms well-established deep learning approaches: it outperformed the competing methods in 8 out of 9 cases.

Citations: 1
Community detection for access-control decisions: Analysing the role of homophily and information diffusion in Online Social Networks
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100203
Nicolás E. Díaz Ferreyra, Tobias Hecking, Esma Aïmeur, Maritta Heisel, H. Ulrich Hoppe

Access-Control Lists (ACLs) (a.k.a. “friend lists”) are one of the most important privacy features of Online Social Networks (OSNs) as they allow users to restrict the audience of their publications. Nevertheless, creating and maintaining custom ACLs can impose a high cognitive burden on average OSN users, since it normally requires assessing the trustworthiness of a large number of contacts. In principle, community detection algorithms can be leveraged to support the generation of ACLs by mapping a set of examples (i.e. contacts labelled as “untrusted”) to the emerging communities inside the user’s ego-network. However, unlike users’ access-control preferences, traditional community-detection algorithms do not take the homophily characteristics of such communities into account (i.e. attributes shared among members). Consequently, this strategy may lead to inaccurate ACL configurations and privacy breaches under certain homophily scenarios. This work investigates the use of community-detection algorithms for the automatic generation of ACLs in OSNs. Particularly, it analyses the performance of the aforementioned approach under different homophily conditions through a simulation model. Furthermore, since private information may reach the scope of untrusted recipients through the re-sharing affordances of OSNs, information diffusion processes are also modelled and taken explicitly into account. Altogether, the removal of gatekeeper nodes is further explored as a strategy to counteract unwanted data dissemination.
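The mapping from a few labelled examples to whole communities can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation model of the paper: the communities are taken as given (the detection step is not shown), and the contact names and the function name `untrusted_acl` are hypothetical.

```python
def untrusted_acl(communities, untrusted_examples):
    """Return all contacts in any community containing a labelled untrusted example."""
    blocked = set()
    for members in communities:
        if members & untrusted_examples:
            blocked |= members
    return blocked

# Hypothetical communities in one user's ego network, assumed to come
# from some community-detection step that is not shown here.
communities = [
    {"alice", "bob", "carol"},
    {"dave", "erin"},
    {"frank"},
]
untrusted_examples = {"erin"}

blocked = untrusted_acl(communities, untrusted_examples)
allowed = set().union(*communities) - blocked
```

Note that `dave` is blocked only because he shares a community with `erin`. The propagation ignores any attributes the members share, which is the kind of inaccuracy the abstract warns about under certain homophily scenarios.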

Citations: 1
Metrics of social curiosity: The WhatsApp case
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100200
Alexandre Magno Sousa, Jussara M. Almeida, Flavio Figueiredo

A number of recent studies have explicitly introduced curiosity models into the analysis of online information consumption, most notably in the design of recommendation systems. However, most prior efforts have neglected the role of social influence as a component of the curiosity stimulation process, which has been referred to as social curiosity. In this paper, we propose a number of metrics to quantify social curiosity and apply them to WhatsApp, a widely used communication platform. We show that our metrics capture aspects that are complementary to other variables previously related to curiosity stimulation, and we use them to offer a broad characterization of user curiosity as a driving force behind communication in WhatsApp.

Citations: 2
Comparing global tourism flows measured by official census and social sensing
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100204
Lucas E.B. Skora, Helen C.M. Senefonte, Myriam Regattieri Delgado, Ricardo Lüders, Thiago H. Silva

A better understanding of the behavior of tourists is strategic for improving services in the competitive and important economic segment of global tourism. Critical studies in the literature often explore the issue using traditional data, such as questionnaires or interviews. Traditional approaches provide precious information; however, they impose challenges to obtaining large-scale data, making it hard to study worldwide patterns. Location-based social networks (LBSNs) can potentially mitigate such issues due to the relatively low cost of acquiring large amounts of behavioral data. Nevertheless, before using such data for studying tourists’ behavior, it is necessary to verify whether the information adequately reveals the behavior measured with traditional data, which is considered the ground truth. Thus, the present work investigates in which countries the global tourism network measured with an LBSN adequately reflects the behavior estimated by the World Tourism Organization using traditional methods. Although we could find exceptions, the results suggest that, for most countries, LBSN data can satisfactorily represent the behavior studied. The results indicate that, in countries with high correlations between results obtained from both datasets, LBSN data can be used in research regarding the mobility of the tourists in the studied context.
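The agreement between the two data sources can be checked, for one country at a time, by correlating official counts with LBSN-derived counts over the same set of origin countries. The sketch below uses the Pearson coefficient as an assumption (the abstract does not name the correlation measure), and all numbers are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)

# Hypothetical arrivals into one destination country from four origins.
official = [1200, 800, 300, 100]  # official estimates (made-up values)
lbsn = [95, 70, 20, 12]           # LBSN users observed travelling (made-up values)

r = pearson(official, lbsn)
```

A coefficient close to 1 for a given country would suggest that the LBSN data ranks and scales tourist flows much like the official figures do, while a low value would flag that country as one of the exceptions.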

Citations: 0
Utilizing subjectivity level to mitigate identity term bias in toxic comments classification
Q1 Social Sciences Pub Date : 2022-05-01 DOI: 10.1016/j.osnem.2022.100205
Zhixue Zhao, Ziqi Zhang, Frank Hopfgartner

Toxic comment classification models are often found biased towards identity terms, i.e., terms characterizing a specific group of people such as “Muslim” and “black”. Such bias is commonly reflected in false positive predictions, i.e., non-toxic comments with identity terms. In this work, we propose a novel approach to debias the model in toxic comment classification, leveraging the notion of subjectivity level of a comment and the presence of identity terms. We hypothesize that toxic comments containing identity terms are more likely to be expressions of subjective feelings or opinions. Therefore, the subjectivity level of a comment containing identity terms can be helpful for classifying toxic comments and mitigating the identity term bias. To implement this idea, we propose a model based on BERT and study two different methods of measuring the subjectivity level. The first method uses a lexicon-based tool. The second method is based on the idea of calculating the embedding similarity between a comment and a relevant Wikipedia text of the identity term in the comment. We thoroughly evaluate our method on an extensive collection of four datasets collected from different social media platforms. Our results show that: (1) our models that incorporate both features of subjectivity and identity terms consistently outperform strong SOTA baselines, with our best performing model achieving an improvement in F1 of 4.75% over a Twitter dataset; (2) our idea of measuring subjectivity based on the similarity to the relevant Wikipedia text is very effective on toxic comment classification as our model using this has achieved the best performance on 3 out of 4 datasets while obtaining comparable performance on the remaining dataset. We further test our method on RoBERTa to evaluate the generality of our method and the results show the biggest improvement in F1 of up to 1.29% (on a dataset from a white supremacist online forum).

Citations: 1