Chunyan An, Yunhan Li, Qiang Yang, Winston K. G. Seah, Zhixu Li, Conghao Yang
Session-based Social Recommendation (SSR) leverages social relationships within online networks to enhance the performance of Session-based Recommendation (SR). However, existing SSR algorithms often encounter the challenge of "friend data sparsity". Moreover, significant discrepancies can exist between the purchase preferences of social network friends and those of the target user, reducing the influence of friends relative to the target user's own preferences. To address these challenges, this paper introduces the concept of "Like-minded Peers" (LMP): users whose preferences, as reflected in their historical sessions, align with the target user's current session. This is the first work, to our knowledge, that uses LMP to enhance the modeling of social influence in SSR. This approach not only alleviates the problem of friend data sparsity but also effectively incorporates users with preferences similar to those of the target user. We propose a novel model named Transformer Encoder with Graph Attention Aggregator Recommendation (TEGAARec), which includes the TEGAA module and the GAT-based social aggregation module. The TEGAA module captures and merges both long-term and short-term interests for target users and LMP users. Concurrently, the GAT-based social aggregation module is designed to aggregate the target users' dynamic interests and social influence in a weighted manner. Extensive experiments on four real-world datasets demonstrate the efficacy and superiority of our proposed model, and ablation studies illustrate the contribution of each component of TEGAARec.
{"title":"Incorporating Like-Minded Peers to Overcome Friend Data Sparsity in Session-Based Social Recommendations","authors":"Chunyan An, Yunhan Li, Qiang Yang, Winston K. G. Seah, Zhixu Li, Conghao Yang","doi":"arxiv-2409.02702","DOIUrl":"https://doi.org/arxiv-2409.02702","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-04"}
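To make the idea of Like-minded Peers concrete, here is a minimal sketch of selecting users whose historical sessions resemble the target user's current session. It is an illustration only: the Jaccard overlap, the function names `jaccard` and `like_minded_peers`, and the toy data are all assumptions, and the abstract does not say how TEGAARec actually identifies LMP users.

```python
def jaccard(session_a, session_b):
    """Jaccard overlap between two sessions, treated as sets of item ids."""
    a, b = set(session_a), set(session_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def like_minded_peers(current_session, history, target_user, k=2):
    """Return up to k users whose best-matching historical session
    overlaps the target user's current session (hypothetical criterion)."""
    scored = []
    for user, sessions in history.items():
        if user == target_user:
            continue  # the target user is not their own peer
        best = max((jaccard(current_session, s) for s in sessions), default=0.0)
        scored.append((best, user))
    # Highest similarity first; ties broken by user id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for score, user in scored[:k] if score > 0]


history = {
    "target": [["i1", "i2", "i3"]],
    "u1": [["i1", "i2"], ["i9"]],
    "u2": [["i3", "i4", "i5"]],
    "u3": [["i7"]],
}
peers = like_minded_peers(["i1", "i2", "i3"], history, "target", k=2)
```

Unlike a friend list, a peer set chosen this way is never empty as long as some other user has interacted with an overlapping item, which is the intuition behind using LMP to counter friend data sparsity.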
Michael Achmann-Denkler, Jakob Fehle, Mario Haim, Christian Wolff
This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. We analyzed over 2,208 Instagram stories and 712 posts using fine-tuned BERT models and OpenAI's GPT-4 models. The fine-tuned BERT model incorporating synthetic training data achieved a macro F1 score of 0.93, demonstrating robust classification performance. Our analysis revealed that 49.58% of Instagram posts and 10.64% of stories contained CTAs, highlighting significant differences in mobilization strategies between these content types. Additionally, we found that the FDP and the Greens had the highest prevalence of CTAs in posts, whereas the CDU and CSU led in story CTAs.
{"title":"Detecting Calls to Action in Multimodal Content: Analysis of the 2021 German Federal Election Campaign on Instagram","authors":"Michael Achmann-Denkler, Jakob Fehle, Mario Haim, Christian Wolff","doi":"arxiv-2409.02690","DOIUrl":"https://doi.org/arxiv-2409.02690","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-04"}
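The macro F1 score cited in the abstract above is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The sketch below computes it from scratch on invented labels; the label names `cta` and `none` and the toy predictions are assumptions for illustration and are not taken from the study's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy labels (hypothetical): one missed CTA and one false alarm.
y_true = ["cta", "cta", "none", "none", "none"]
y_pred = ["cta", "none", "none", "none", "cta"]
score = macro_f1(y_true, y_pred)  # F1(cta) = 0.5, F1(none) = 2/3
```

Because each class contributes equally, macro F1 is a common choice when one class is much rarer than the other, as CTAs are in stories according to the percentages reported above.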
Jeremy Kepner, Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill, Aydin Buluc, Chansup Byun, Timothy Davis, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Piotr Luszczek, Lauren Milechin, Chasen Milner, Guillermo Morales, Andrew Morris, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Peter Michaleas
Understanding what is normal is a key aspect of protecting a domain. Other domains invest heavily in observational science to develop models of normal behavior to better detect anomalies. Recent advances in high performance graph libraries, such as the GraphBLAS, coupled with supercomputers enable processing of the trillions of observations required. We leverage this approach to synthesize low-parameter observational models of anonymized Internet traffic with a high regard for privacy.
{"title":"What is Normal? A Big Data Observational Science Model of Anonymized Internet Traffic","authors":"Jeremy Kepner, Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill, Aydin Buluc, Chansup Byun, Timothy Davis, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Piotr Luszczek, Lauren Milechin, Chasen Milner, Guillermo Morales, Andrew Morris, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Peter Michaleas","doi":"arxiv-2409.03111","DOIUrl":"https://doi.org/arxiv-2409.03111","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-04"}
With the rapid development of social media, analyzing social network user data has become increasingly important. User representation learning in social media is a critical area of research, on which personalized content delivery and the detection of malicious actors can be built. Social network user data is more complicated than many other types of data and is inherently multimodal. Various multimodal approaches have been proposed to harness both text (i.e., post content) and relation (i.e., inter-user interaction) information to learn user embeddings of higher quality. The advent of Graph Neural Network models enables more end-to-end integration of user text embeddings and user interaction graphs in social networks. However, most of those approaches do not adequately elucidate which aspects of the data (text or graph structure information) are more helpful for predicting each specific user under a particular task, which burdens personalized downstream analysis and the filtering of untrustworthy information. We propose a simple yet effective framework called Contribution-Aware Multimodal User Embedding (CAMUE) for social networks. We demonstrate with empirical evidence that our approach can provide personalized explainable predictions, automatically mitigating the impact of unreliable information. We also conduct case studies to show how reasonable our results are. We observe that for most users, graph structure information is more trustworthy than text information, but there are some reasonable cases where text helps more. Our work paves the way for more explainable, reliable, and effective social media user embedding, which allows for better personalized content delivery.
{"title":"Do We Trust What They Say or What They Do? A Multimodal User Embedding Provides Personalized Explanations","authors":"Zhicheng Ren, Zhiping Xiao, Yizhou Sun","doi":"arxiv-2409.02965","DOIUrl":"https://doi.org/arxiv-2409.02965","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-04"}
Yasamin Tabatabaee, Eleanor Wedell, Minhyuk Park, Tandy Warnow
Many community detection algorithms are stochastic in nature, and their output can vary based on different input parameters and random seeds. Consensus clustering methods, such as FastConsensus and ECG, combine clusterings from multiple runs of the same clustering algorithm in order to improve stability and accuracy. In this study we present a new consensus clustering method, FastEnsemble, and show that it provides advantages over both FastConsensus and ECG. Furthermore, FastEnsemble is designed for use with any clustering method, and we show results using our method with Leiden optimizing modularity or the Constant Potts model. FastEnsemble is available on GitHub at https://github.com/ytabatabaee/fast-ensemble
{"title":"FastEnsemble: A new scalable ensemble clustering method","authors":"Yasamin Tabatabaee, Eleanor Wedell, Minhyuk Park, Tandy Warnow","doi":"arxiv-2409.02077","DOIUrl":"https://doi.org/arxiv-2409.02077","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-03"}
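The general idea of consensus clustering described in the abstract above, combining several runs of a stochastic algorithm, can be sketched as follows. This is a generic co-clustering illustration and not the FastEnsemble procedure: the function names, the threshold value, and the use of connected components as the final step are assumptions made to keep the example dependency-free.

```python
from collections import defaultdict


def co_clustering_weights(edges, partitions):
    """Fraction of partitions in which both endpoints of an edge share a cluster."""
    return {
        (u, v): sum(1 for part in partitions if part[u] == part[v]) / len(partitions)
        for u, v in edges
    }


def consensus_components(nodes, edges, partitions, threshold=0.5):
    """Keep edges whose endpoints are co-clustered often enough, then
    return the connected components of what remains (a simplification)."""
    adjacency = defaultdict(set)
    for (u, v), weight in co_clustering_weights(edges, partitions).items():
        if weight >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)
    return clusters


nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
# Three hypothetical runs; the third run places node 3 in the other cluster.
runs = [
    {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1},
    {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1},
    {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1},
]
clusters = consensus_components(nodes, edges, runs, threshold=0.5)
```

With the threshold at 0.5 the disagreement about node 3 is resolved by majority; raising it to 0.8 would instead leave node 3 in a cluster of its own, which shows how the threshold trades stability against resolution.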
We introduce the Tidal Tales Plugin, a Firefox extension for efficiently collecting and archiving Instagram stories, addressing the challenges of ephemeral data in social media research. It enables automated collection of story metadata and media files without risking account bans. It contributes to Web Science by facilitating expansive, long-term studies with enhanced data access and integrity.
{"title":"Preserving the Ephemeral: Instagram Story Archiving with the Tidal Tales Plugin","authors":"Michael Achmann-Denkler, Christian Wolff","doi":"arxiv-2409.01880","DOIUrl":"https://doi.org/arxiv-2409.01880","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-03"}
When designing a public transportation network in a country, one may want to minimise the sum of the travel durations of all inhabitants. This corresponds to a purely utilitarian view and does not involve any fairness consideration, as the resulting network will typically benefit the capital city and/or large central cities while leaving some peripheral cities behind. On the other hand, a more egalitarian view will allow some people to travel between peripheral cities without having to go through a central city. We define a model, propose algorithms for computing solution networks, and report on experiments based on real data.
{"title":"Fair Railway Network Design","authors":"Zixu He, Sirin Botan, Jérôme Lang, Abdallah Saffidine, Florian Sikora, Silas Workman","doi":"arxiv-2409.02152","DOIUrl":"https://doi.org/arxiv-2409.02152","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-03"}
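The contrast between the utilitarian and egalitarian views in the abstract above can be made concrete by scoring one candidate network under two objectives. The sketch below is an assumption-laden illustration, not the paper's model: weighting each city pair by the product of populations, taking the worst pairwise duration as the egalitarian measure, and the three-city toy data are all hypothetical choices.

```python
import itertools
import math


def all_pairs_durations(cities, edges):
    """Floyd-Warshall shortest travel durations over an undirected network."""
    dist = {(a, b): (0.0 if a == b else math.inf) for a in cities for b in cities}
    for (a, b), duration in edges.items():
        dist[(a, b)] = min(dist[(a, b)], duration)
        dist[(b, a)] = min(dist[(b, a)], duration)
    for k in cities:
        for i in cities:
            for j in cities:
                if dist[(i, k)] + dist[(k, j)] < dist[(i, j)]:
                    dist[(i, j)] = dist[(i, k)] + dist[(k, j)]
    return dist


def utilitarian_cost(cities, population, dist):
    """Sum of durations over all pairs, weighted by population product (assumed)."""
    return sum(
        population[a] * population[b] * dist[(a, b)]
        for a, b in itertools.combinations(cities, 2)
    )


def egalitarian_cost(cities, dist):
    """Worst travel duration between any two cities (assumed fairness measure)."""
    return max(dist[(a, b)] for a, b in itertools.combinations(cities, 2))


cities = ["Capital", "East", "West"]
population = {"Capital": 10, "East": 1, "West": 1}
# Star network: both peripheral cities connect only through the capital.
star = {("Capital", "East"): 1.0, ("Capital", "West"): 1.0}
dist = all_pairs_durations(cities, star)
total = utilitarian_cost(cities, population, dist)
worst = egalitarian_cost(cities, dist)
```

In this toy star network the East to West trip contributes little to the weighted total because few people make it, yet it is the longest journey; that gap between the two scores is the tension the abstract describes.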
How do Wikipedians maintain an accurate encyclopedia during an ongoing geopolitical conflict where state actors might seek to spread disinformation or conduct an information operation? In the context of the Russia-Ukraine War, this question becomes more pressing, given the Russian government's extensive history of orchestrating information campaigns. We conducted an interview study with 13 expert Wikipedians involved in the Russo-Ukrainian War topic area on the English-language edition of Wikipedia. While our participants did not perceive there to be clear evidence of a state-backed information operation, they agreed that war-related articles experienced high levels of disruptive editing from both Russia-aligned and Ukraine-aligned accounts. The English-language edition of Wikipedia had existing policies and processes at its disposal to counter such disruption. State-backed or not, the disruptive activity created time-intensive maintenance work for our participants. Finally, participants considered English-language Wikipedia to be more resilient than social media in preventing the spread of false information online. We conclude by discussing sociotechnical implications for Wikipedia and social platforms.
{"title":"Wikipedia in Wartime: Experiences of Wikipedians Maintaining Articles About the Russia-Ukraine War","authors":"Laura Kurek, Ceren Budak, Eric Gilbert","doi":"arxiv-2409.02304","DOIUrl":"https://doi.org/arxiv-2409.02304","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-03"}
Johannes Niu, Mila Stillman, Philipp Seeberger, Anna Kruspe
Open Source Intelligence (OSINT) refers to intelligence efforts based on freely available data. It has become a frequent topic of conversation on social media, where private users or networks can share their findings. Such data is highly valuable in conflicts, both for gaining a new understanding of the situation and for tracking the spread of misinformation. In this paper, we present a method for collecting such data as well as a novel OSINT dataset for the Russo-Ukrainian war drawn from Twitter between January 2022 and July 2023. It is based on an initial search for users posting OSINT and a subsequent snowballing approach to detect more. The final dataset contains almost 2 million Tweets posted by 1040 users. We also provide some first analyses and experiments on the data, and make suggestions for its future usage.
{"title":"A dataset of Open Source Intelligence (OSINT) Tweets about the Russo-Ukrainian war","authors":"Johannes Niu, Mila Stillman, Philipp Seeberger, Anna Kruspe","doi":"arxiv-2409.01052","DOIUrl":"https://doi.org/arxiv-2409.01052","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-02"}
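The snowballing step mentioned in the abstract above, starting from seed users and expanding to more, can be sketched as a bounded breadth-first expansion. Everything concrete here is hypothetical: the `neighbors` mapping (for example, accounts a user retweets or mentions), the `is_relevant` filter, and the round limit are assumptions, since the abstract does not specify how candidates are found or screened.

```python
def snowball(seeds, neighbors, is_relevant, max_rounds=2):
    """Expand a seed set by repeatedly adding relevant accounts
    that are connected to accounts found in the previous round."""
    collected = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        candidates = set()
        for user in frontier:
            candidates.update(neighbors.get(user, ()))
        # Only new accounts that pass the relevance check join the sample.
        frontier = {u for u in candidates if u not in collected and is_relevant(u)}
        if not frontier:
            break
        collected |= frontier
    return collected


# Toy interaction data (hypothetical account names).
neighbors = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": ["f"]}
users = snowball(["a"], neighbors, is_relevant=lambda u: u != "c", max_rounds=2)
```

Rejected accounts such as `c` are not expanded further, so the relevance filter also limits drift away from the topic as the rounds progress.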
Giulia Preti, Matteo Riondato, Aristides Gionis, Gianmarco De Francisci Morales
We introduce Polaris, a network null model for colored multigraphs that preserves the Joint Color Matrix (JCM). Polaris is specifically designed for studying network polarization, where vertices belong to a side in a debate or a partisan group, represented by a vertex color, and relations have different strengths, represented by an integer-valued edge multiplicity. The key feature of Polaris is preserving the JCM of the multigraph, which specifies the number of edges connecting vertices of any two given colors. The JCM is the basic property that determines color assortativity, a fundamental aspect in studying homophily and segregation in polarized networks. By using Polaris, network scientists can test whether a phenomenon is entirely explained by the JCM of the observed network or whether other phenomena might be at play. Technically, our null model is an extension of the configuration model: an ensemble of colored multigraphs characterized by the same degree sequence and the same JCM. To sample from this ensemble, we develop a suite of Markov Chain Monte Carlo algorithms, collectively named Polaris-*. It includes Polaris-B, an adaptation of a generic Metropolis-Hastings algorithm, and Polaris-C, a faster, specialized algorithm with higher acceptance probabilities. This new null model and the associated algorithms provide a more nuanced toolset for examining polarization in social networks, thus enabling statistically sound conclusions.
{"title":"Polaris: Sampling from the Multigraph Configuration Model with Prescribed Color Assortativity","authors":"Giulia Preti, Matteo Riondato, Aristides Gionis, Gianmarco De Francisci Morales","doi":"arxiv-2409.01363","DOIUrl":"https://doi.org/arxiv-2409.01363","journal":{"name":"arXiv - CS - Social and Information Networks"},"publicationDate":"2024-09-02"}
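As a small illustration of the two quantities the null model above keeps fixed, the sketch below computes the Joint Color Matrix and the degree sequence of a toy colored multigraph, then checks that one double-edge swap between same-colored endpoints leaves both unchanged. The swap is a standard textbook move used here only to show what "preserving the JCM" means; it is not Polaris-B or Polaris-C, and the data layout (a dict from vertex pairs to multiplicities) is an assumption.

```python
from collections import Counter


def joint_color_matrix(edges, color):
    """Number of edges (counting multiplicity) between each unordered color pair."""
    jcm = Counter()
    for (u, v), multiplicity in edges.items():
        pair = tuple(sorted((color[u], color[v])))
        jcm[pair] += multiplicity
    return jcm


def degrees(edges):
    """Degree of each vertex, counting edge multiplicities."""
    deg = Counter()
    for (u, v), multiplicity in edges.items():
        deg[u] += multiplicity
        deg[v] += multiplicity
    return deg


color = {"a": "red", "b": "blue", "c": "red", "d": "blue"}
observed = {("a", "b"): 2, ("c", "d"): 1}

# Swap one copy of (a, b) and the edge (c, d) into (a, d) and (c, b).
# Because b and d share a color, every color pair keeps its edge count.
swapped = {("a", "b"): 1, ("a", "d"): 1, ("c", "b"): 1}

same_jcm = joint_color_matrix(observed, color) == joint_color_matrix(swapped, color)
same_degrees = degrees(observed) == degrees(swapped)
```

A swap that exchanged endpoints of different colors would keep the degree sequence but could change the JCM, which is why a sampler for this ensemble has to restrict or correct its moves.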