Mikko Korkiakoski, Fatima Sadiq, Febrian Setianto, U. Latif, Paula Alavesa, Panos Kostakos
The surge of COVID-19 has introduced a new threat surface as malevolent actors try to benefit from the pandemic. Because of this, new information sources and visualization tools about COVID-19 have been introduced into the workflow of frontline practitioners. As a result, analysts are increasingly required to shift their focus between different visual displays to monitor pandemic-related data, security threats, and incidents. Augmented reality (AR) smart glasses can overlay digital data onto the physical environment in a comprehensible manner. However, real-life use situations are often complex and require fast knowledge acquisition from multiple sources. In this study, we report results from an experiment with six subjects using an AR-overlaid information interface coupled with traditional computer monitors. Our goal was to evaluate a multitasking setup with traditional monitors and an AR headset where notifications from the new COVID-19 MISP instance were visualized. Our results indicate that better situational awareness does translate to increased task performance, but at the cost of a gender gap that requires further attention.
{"title":"Using smart glasses for monitoring cyber threat intelligence feeds","authors":"Mikko Korkiakoski, Fatima Sadiq, Febrian Setianto, U. Latif, Paula Alavesa, Panos Kostakos","doi":"10.1145/3487351.3492722","DOIUrl":"https://doi.org/10.1145/3487351.3492722","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"123327705"}
Counting motifs in an uncertain graph, in which each link is associated with a connection probability, is computationally expensive when the graph is huge because of the extremely large number of possible worlds. A natural approach is to rely on sampling-based approximation methods, but this still needs many sample graphs to obtain accurate results. We propose a novel method that analytically computes the expected frequency of a motif without relying on expensive sampling. Marginalizing the probability of each possible world on a candidate motif can drastically reduce the number of possible worlds that need to be considered when the motif is small. Experiments using real-world data confirm that the proposed method is effective and efficient. It is far better than the state-of-the-art sampling-based method: the accuracy is guaranteed and the method runs about four orders of magnitude faster. Its running time does not depend on the connection probability.
{"title":"Efficient analytical computation of expected frequency of motifs of small size by marginalization in uncertain network","authors":"Takayasu Fushimi, Kazumi Saito, H. Motoda","doi":"10.1145/3487351.3488275","DOIUrl":"https://doi.org/10.1145/3487351.3488275","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"121522272"}
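To illustrate why an analytical computation can avoid sampling in the motif-counting entry above, the following minimal sketch computes the expected number of triangles in a tiny uncertain graph as a sum of edge-probability products, then checks the value against brute-force enumeration of every possible world. It assumes independent edges and uses the triangle as the only motif; the function names and the toy graph are hypothetical, and this is not the authors' algorithm.

```python
from itertools import combinations, product


def triangle_edges(a, b, c):
    """Return the three undirected edges of the triangle on nodes a, b, c."""
    return (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))


def expected_triangles(edge_prob):
    """Analytical expectation: sum over node triples of the product of edge probabilities."""
    nodes = sorted({v for edge in edge_prob for v in edge})
    total = 0.0
    for a, b, c in combinations(nodes, 3):
        p = 1.0
        for edge in triangle_edges(a, b, c):
            p *= edge_prob.get(edge, 0.0)  # a missing edge has probability 0
        total += p
    return total


def expected_triangles_bruteforce(edge_prob):
    """Reference value: enumerate all 2^|E| possible worlds (tiny graphs only)."""
    edges = list(edge_prob)
    nodes = sorted({v for edge in edges for v in edge})
    total = 0.0
    for world in product((False, True), repeat=len(edges)):
        present = {e for e, keep in zip(edges, world) if keep}
        prob = 1.0
        for e, keep in zip(edges, world):
            prob *= edge_prob[e] if keep else 1.0 - edge_prob[e]
        count = sum(
            1
            for a, b, c in combinations(nodes, 3)
            if set(triangle_edges(a, b, c)) <= present
        )
        total += prob * count
    return total


# Hypothetical toy graph: 4 nodes, 5 uncertain edges.
toy_graph = {
    frozenset((0, 1)): 0.5,
    frozenset((1, 2)): 0.5,
    frozenset((0, 2)): 0.5,
    frozenset((2, 3)): 0.8,
    frozenset((1, 3)): 0.25,
}

analytic = expected_triangles(toy_graph)
brute = expected_triangles_bruteforce(toy_graph)
print(analytic, brute)
```

The analytical version visits each candidate triple once, whereas the brute-force reference visits 2^5 = 32 worlds even for this toy graph; the two values agree because expectation is linear.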
J. Kwarteng, S. Perfumi, T. Farrell, Miriam Fernández
"Misogynoir" refers to the specific forms of misogyny that Black women experience, which couple racism and sexism together. To better understand the online manifestations of this type of hate, and to propose methods that can automatically identify it, we conduct a study of four cases of Black women in tech reporting experiences of misogynoir on the Twitter platform. We follow the reactions to these cases (both supportive and non-supportive responses), and categorise them within a model of misogynoir that highlights experiences of Tone Policing, White Centring, Racial Gaslighting and Defensiveness. Because misogynoir is an intersectional form of abusive or hateful speech, we investigate the possibilities and challenges of detecting online instances of it in an automated way. We then conduct a closer qualitative analysis of messages of support and non-support to look at some of these categories in more detail. The purpose of this investigation is to understand responses to misogynoir online, including doubling down on misogynoir, engaging in performative allyship, and showing solidarity with Black women in tech.
{"title":"Misogynoir: public online response towards self-reported misogynoir","authors":"J. Kwarteng, S. Perfumi, T. Farrell, Miriam Fernández","doi":"10.1145/3487351.3488342","DOIUrl":"https://doi.org/10.1145/3487351.3488342","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"125118067"}
Demand prediction is crucial for companies in the retail industry to increase their profit and customer satisfaction. Although recent studies show the success of state-of-the-art machine learning and deep learning models in demand prediction, enriching datasets with graph-based feature representations to improve demand forecasting models is still rare. In this study, we propose a demand forecasting model that uses graph-based product embeddings. Unlike most existing methods, the sales data is used to extract the relations, and several relationships are used to construct graphs. Using the Node2Vec and GraphSAGE algorithms, five different embeddings are evaluated to reflect the different relationships between products. Extreme Gradient Boosting Regressor (XGBR) is preferred over other models because of its ability to handle highly sparse data. To observe and compare the results of different models, we also implement Long Short Term Memory (LSTM). The performance is evaluated using a public retail dataset, and the results show that the proposed model achieves lower error using Node2Vec graph-based embeddings with XGBR.
{"title":"Enriching demand prediction with product relationship information using graph neural networks","authors":"Yaren Yilmaz, Ş. Öğüdücü","doi":"10.1145/3487351.3489477","DOIUrl":"https://doi.org/10.1145/3487351.3489477","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"116870654"}
The influence maximization problem aims to find the best seeding set of nodes in a network to increase the influence spread, under various information diffusion models. Recent advances have shown the importance of the timing of the seeding and introduced the sequential seeding approach, determining a step-by-step cascade of activations. Our study explores a novel Deterministic Influence Maximization Approach (DIMA) for time-based sequential seeding dynamics in a threshold-based model. We examine the problem characteristics and formulate solutions optimizing a scheduled sequential seeding strategy. Based on a set of empirical simulations we demonstrate the properties of the deterministic sequential problem, incorporate three different mathematical programming formulations and provide an initial benchmark for optimization techniques.
{"title":"Deterministic influence maximization approach for sequential active marketing","authors":"Dmitri Goldenberg, Eyal Tzvi Tenzer","doi":"10.1145/3487351.3489474","DOIUrl":"https://doi.org/10.1145/3487351.3489474","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"129611007"}
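The sequential-seeding entry above can be made concrete with a small simulation. The sketch below runs a deterministic threshold diffusion in which seeds are injected according to a schedule rather than all at once. The activation rule (a node becomes active once the active fraction of its neighbours reaches its threshold), the synchronous update, and every name and value here are assumptions chosen for illustration; the abstract does not specify the exact DIMA model or its mathematical programming formulations.

```python
def run_sequential_seeding(adj, thresholds, schedule, steps):
    """Deterministic threshold diffusion with scheduled seeds.

    adj:        dict node -> set of neighbouring nodes
    thresholds: dict node -> fraction of neighbours that must be active
    schedule:   dict step -> list of nodes seeded at the start of that step
    Returns the final active set and the active-set size after each step.
    """
    active = set()
    history = []
    for t in range(steps):
        active |= set(schedule.get(t, []))  # inject this step's seeds
        # Synchronous update: decisions use the active set before this round.
        newly = {
            v
            for v in adj
            if v not in active
            and adj[v]
            and len(adj[v] & active) / len(adj[v]) >= thresholds[v]
        }
        active |= newly
        history.append(len(active))
    return active, history


# Hypothetical toy network: a path 0-1-2-3-4-5 where node 3 is hard to activate.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
thresholds = {v: 0.5 for v in path}
thresholds[3] = 1.0  # node 3 needs both neighbours active

upfront = {0: [0, 5]}          # both seeds at step 0
staggered = {0: [0], 2: [5]}   # second seed delayed to step 2

_, upfront_history = run_sequential_seeding(path, thresholds, upfront, steps=4)
_, staggered_history = run_sequential_seeding(path, thresholds, staggered, steps=4)
print(upfront_history, staggered_history)
```

On this toy path both schedules eventually activate all six nodes, but the per-step coverage differs, which is the kind of timing trade-off a scheduled strategy has to optimise.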
Meng-Chieh Lee, H. Nguyen, Dimitris Berberidis, V. Tseng, L. Akoglu
Given a set of node-labeled directed weighted graphs, how can we find the most anomalous ones? How can we summarize the normal behavior in the database without losing information? We propose GAWD for detecting anomalous graphs in directed weighted graph databases. The idea is to (1) iteratively identify the "best" substructure (i.e., subgraph or motif) that yields the largest compression when each of its occurrences is replaced by a super-node, and (2) score each graph by how much it compresses over iterations: the greater the compression, the lower the anomaly score. Unlike the existing work [1] on which we build, GAWD offers (i) a lossless graph encoding scheme, (ii) the ability to handle numeric edge weights, (iii) interpretability through common patterns, and (iv) scalability, with running time linear in the input size. Experiments on four datasets injected with anomalies show that GAWD achieves significantly better results than state-of-the-art baselines.
{"title":"GAWD: graph anomaly detection in weighted directed graph databases","authors":"Meng-Chieh Lee, H. Nguyen, Dimitris Berberidis, V. Tseng, L. Akoglu","doi":"10.1145/3487351.3488325","DOIUrl":"https://doi.org/10.1145/3487351.3488325","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"126213522"}
We propose an approach inspired by the diffusion of innovations theory to model and characterize fake news sharing in social media through the lens of different levels of influential factors (users, networks, and news). We address the problem of predicting fake news sharing as a classification task and demonstrate the potential of the proposed features by achieving an AUROC of 0.97 and an average precision of 0.88, consistently outperforming baseline models by a large margin (about 30% in AUROC). We also show that news-based features are the most effective at predicting real and fake news sharing, followed by the user- and network-based features.
{"title":"Are you influenced?: modeling the diffusion of fake news in social media","authors":"Abishai Joy, Anu Shrestha, Francesca Spezzano","doi":"10.1145/3487351.3488345","DOIUrl":"https://doi.org/10.1145/3487351.3488345","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"122770032"}
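The entry above reports its results as AUROC and average precision. For readers unfamiliar with the two metrics, here is a minimal sketch of how each can be computed from binary labels and classifier scores. The labels and scores are made up for illustration (1 standing for "the user shares the item" is an assumption), and the code has no connection to the authors' features or models.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Made-up example: 1 = shares the item, 0 = does not.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]

print(auroc(labels, scores), average_precision(labels, scores))
```

In this example four of the six positive-negative pairs are ordered correctly, so the AUROC is 4/6, and the precision values at the three positives are 1, 2/3 and 3/4.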
Stephany Rajeh, M. Savonnet, É. Leclercq, H. Cherifi
It is of paramount importance to uncover influential nodes to control diffusion phenomena in a network. In recent works, there is a growing trend to investigate the role of the community structure to solve this issue. Up to now, the vast majority of the so-called community-aware centrality measures rely on non-overlapping community structure. However, in many real-world networks, such as social networks, the communities overlap. In other words, a node can belong to multiple communities. To overcome this drawback, we propose and investigate the "Overlapping Modularity Vitality" centrality measure. This extension of "Modularity Vitality" quantifies the community structure strength variation when removing a node. It allows identifying a node as a hub or a bridge based on its contribution to the overlapping modularity of a network. A comparative analysis with its non-overlapping version using the Susceptible-Infected-Recovered (SIR) epidemic diffusion model has been performed on a set of six real-world networks. Overall, Overlapping Modularity Vitality outperforms its alternative. These results illustrate the importance of incorporating knowledge about the overlapping community structure to identify influential nodes effectively. Moreover, one can use multiple ranking strategies as the two measures are signed. Results show that selecting the nodes with the top positive or the top absolute centrality values is more effective than choosing the ones with the maximum negative values to spread the epidemic.
{"title":"Identifying influential nodes using overlapping modularity vitality","authors":"Stephany Rajeh, M. Savonnet, É. Leclercq, H. Cherifi","doi":"10.1145/3487351.3488277","DOIUrl":"https://doi.org/10.1145/3487351.3488277","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"125641586"}
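To make the idea of a signed, vitality-based centrality in the entry above more tangible, the sketch below computes the change in modularity caused by removing a node. It deliberately uses the standard non-overlapping Newman modularity, not the overlapping variant the paper proposes, and it assumes the sign convention "modularity of the full graph minus modularity after removal". The helper names and the toy graph are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def modularity(adj, communities):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ] for disjoint communities."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    if m == 0:
        return 0.0
    q = 0.0
    for comm in communities:
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree_sum = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


def modularity_vitality(adj, communities, node):
    """Q(G) - Q(G without node); the sign convention is an assumption of this sketch."""
    reduced_adj = {u: nbrs - {node} for u, nbrs in adj.items() if u != node}
    reduced_comms = [comm - {node} for comm in communities]
    return modularity(adj, communities) - modularity(reduced_adj, reduced_comms)


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = build_adjacency(edges)
communities = [{0, 1, 2}, {3, 4, 5}]

for node in (0, 2):
    print(node, modularity_vitality(adj, communities, node))
```

Under this convention node 2, which carries the inter-community edge, gets a negative value (removing it makes the communities cleaner), while node 0, which only has intra-community edges, gets a positive one. That is why such a measure can be ranked by positive, negative, or absolute value.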
Organizational risk and resilience as well as insider threat have been studied through the lenses of socio-psychological studies and information and computer sciences. As with all disciplines, it is an area in which practitioners, enthusiasts, and experts discuss the theory, issues, and solutions of the field in various online public forums. Such conversations, despite their public nature, can be difficult to understand and to study, even by those deeply involved in the communities themselves. Who are the key actors? How can we understand and characterize the culture around such communities, the problems they face, and the solutions favored by the experts in the field? Which narratives are being created and propagated, and by whom - and are these actors truly people, or are they autonomous agents, or "bots"? In this paper, we demonstrate the value in applying dynamic network analysis and social network analysis to gain situational awareness of the public conversation around insider threat, nation-state espionage, and industrial espionage. Characterizing public discourse around a topic can reveal individuals and organizations attempting to push or shape narratives in ways that might not be obvious to casual observation. Such techniques have been used to great effect in the study of elections, the COVID-19 pandemic, and the study of misinformation and disinformation, and we hope to show that their use in this area is a powerful way to build a foundation of understanding around the conversations in the online public forum, provide data and analysis for use in further research, and equip counter insider threat practitioners with new insights.
{"title":"Conversations around organizational risk and insider threat","authors":"Luke J. Osterritter, Kathleen M. Carley","doi":"10.1145/3487351.3492721","DOIUrl":"https://doi.org/10.1145/3487351.3492721","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"130833382"}
An ongoing challenge in network science is influence maximization (IM), which sets out to identify the nodes that maximize the dissemination of influence. Most recent research proposals on the IM problem offer solutions that are still too time consuming for use on real-world complex networks. This article develops a novel seed selection framework based on the principle of maximizing influence at the community level with an emphasis on globally homogeneous seed spacing. Our proposed framework, called Colonise, consists of the following stages: (i) community tuning, (ii) node centrality computation, and (iii) seed assignment. In particular, phase (i) iteratively breaks down the network into communities, using the Louvain method, based on the number of desired seeds; phase (ii) measures a target node centrality on each community to reduce the number of seed candidates; phase (iii) assigns nodes as seeds from the highest-centrality nodes found in each community. In contrast to global centrality-based seed selection, we exploit the structure of communities and circumvent overlapped assignment, so that the seed nodes are selected efficiently to boost information diffusion. The simulation results, based on 12 diverse synthetic and real-world networks and employing the SIR epidemic model, show that our proposed Colonise algorithm surpasses state-of-the-art selection methods in all simulated scenarios, with an increase in diffusion efficiency ranging from +0.15% to +173.53% (22.36% on average), without compromising either diffusion coverage or speed.
{"title":"Fast colonization algorithm for seed selection in complex networks based on community detection","authors":"Alexandru Topîrceanu, M. Udrescu","doi":"10.1145/3487351.3488319","DOIUrl":"https://doi.org/10.1145/3487351.3488319","journal":{"name":"Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining"},"publicationDate":"2021-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"130922585"}
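Phases (ii) and (iii) of the community-based seed selection described in the entry above can be sketched in a few lines. This is only an illustration under strong assumptions: the community partition is supplied by hand instead of being tuned with the Louvain method, within-community degree stands in for the unspecified "target node centrality", and exactly one seed is taken per community. The function names and toy network are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def community_seeds(adj, communities):
    """Pick one seed per community: the node with the highest within-community degree."""
    seeds = []
    for comm in communities:
        # sorted() makes tie-breaking deterministic (smallest label wins).
        best = max(sorted(comm), key=lambda u: len(adj[u] & comm))
        seeds.append(best)
    return seeds


def global_degree_seeds(adj, k):
    """Baseline for contrast: the k highest-degree nodes in the whole network."""
    return sorted(adj, key=lambda u: (-len(adj[u]), u))[:k]


# Hypothetical toy network: a star around node 0 and a short chain around node 5,
# joined by the edge 3-4. The partition is given by hand (no Louvain step here).
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5), (5, 6)]
adj = build_adjacency(edges)
communities = [{0, 1, 2, 3}, {4, 5, 6}]

print(community_seeds(adj, communities))
print(global_degree_seeds(adj, k=2))
```

On this toy network the global baseline places both seeds in the first community (nodes 0 and 3), whereas the community-level choice spreads them out (nodes 0 and 5), which is the kind of seed spacing the abstract emphasises.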