Pub Date: 2024-03-22 DOI: 10.1140/epjds/s13688-024-00463-4
Mohamed Amine Bouzaghrane, Hassan Obeid, Marta González, Joan Walker
Despite the historically documented regularity in human mobility patterns, the relaxation of spatial and temporal constraints brought by the widespread adoption of telecommuting and e-commerce during the COVID-19 pandemic, together with a growing desire for flexible work arrangements in a post-pandemic world, indicates a potential reshaping of these patterns. In this paper, we investigate the multifaceted impacts of relaxed spatio-temporal constraints on human mobility, using well-established metrics from the travel behavior literature. Further, we introduce a novel metric for schedule regularity, accounting for specific day-of-week characteristics that previous approaches overlooked. Building on the large body of literature on the impacts of COVID-19 on human mobility, we make use of passively tracked Point of Interest (POI) data for approximately 21,700 smartphone users in the US, and analyze data between January 2020 and September 2022 to answer two key questions: (1) have the COVID-19 pandemic and its associated relaxation of spatio-temporal activity patterns reshaped the different aspects of human mobility, and (2) have we achieved a stable post-pandemic “new normal”? We hypothesize that the relaxation of the spatio-temporal constraints around key activities will result in people exhibiting less regular schedules. Findings reveal a complex landscape: while some mobility indicators, such as trip frequency and travel distance, have reverted to pre-pandemic norms, others, notably at-home dwell-time, persist at altered levels, suggesting a recalibration rather than a return to past behaviors. Most notably, our analysis reveals a paradox: despite the documented large-scale shift towards flexible work arrangements, schedule habits have strengthened rather than relaxed, defying our initial hypothesis and highlighting a desire for regularity. The study’s results contribute to a deeper understanding of the post-pandemic “new normal”, offering key insights on how multiple facets of travel behavior were reshaped, if at all, by the COVID-19 pandemic, and will help inform transportation planning in a post-pandemic world.
Title: Human mobility reshaped? Deciphering the impacts of the Covid-19 pandemic on activity patterns, spatial habits, and schedule habits. Journal: EPJ Data Science.
Pub Date: 2024-03-21 DOI: 10.1140/epjds/s13688-024-00459-0
Abstract
Automated positioning devices can generate large datasets with information on the movement of humans, animals and objects, revealing patterns of movement, hot spots and overlaps, among others. However, in the case of Automatic Identification System (AIS) devices attached to vessels, strange behaviors observed in the tracking datasets may come from intentional manipulation of the electronic devices. Thus, the analysis of anomalies can provide valuable information on suspicious behavior. Here, we analyze anomalies in fishing vessel trajectories obtained with AIS. The map of silence anomalies, those that occur when positioning data are absent for more than 24 hours, shows that they are most likely to occur closer to land, with 87.1% of anomalies observed within 100 km of the coast. This behavior suggests the potential of using silence anomalies as a proxy for illegal activities. With the increasing availability of high-resolution positioning of vessels and the development of powerful statistical analytical tools, we provide hints on the automatic detection of illegal activities that may help optimize the management of fishing resources.
Title: Identification of suspicious behavior through anomalies in the tracking data of fishing vessels. Journal: EPJ Data Science.
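The silence-anomaly definition in the abstract above (positioning data absent for more than 24 hours) can be illustrated with a minimal sketch. The function name `find_silence_gaps` and the toy timestamps are hypothetical and chosen only for illustration; the paper's actual detection pipeline is not described in the abstract and may differ.

```python
from datetime import datetime, timedelta

def find_silence_gaps(timestamps, threshold=timedelta(hours=24)):
    """Return (last_seen, next_seen) pairs whose spacing exceeds `threshold`.

    `timestamps` is an iterable of datetime objects for one vessel.
    The 24-hour default follows the definition quoted in the abstract.
    """
    ordered = sorted(timestamps)
    # Compare each report with the one that follows it.
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]

# Hypothetical position-report times for a single vessel.
pings = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 6, 0),
    datetime(2024, 1, 3, 8, 0),   # 50 hours after the previous report
    datetime(2024, 1, 3, 12, 0),
]
gaps = find_silence_gaps(pings)
```

Each returned pair marks the last report before the silence and the first report after it; combining these with coastline distances would be a separate step not sketched here.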
Modeling human mobility helps to understand how people access resources and come into physical contact with each other in cities, and thus contributes to various applications such as urban planning, epidemic control, and location-based advertisement. Next location prediction is a decisive task in individual human mobility modeling and is usually treated as a sequence modeling problem, solved with Markov or RNN-based methods. However, existing models pay little attention to the logic of individual travel decisions and to the reproducibility of the collective behavior of the population. To this end, we propose a Causal and Spatial-constrained Long and Short-term Learner (CSLSL) for next location prediction. CSLSL utilizes a causal structure based on multi-task learning to explicitly model the “when→what→where”, a.k.a. “time→activity→location”, decision logic. We next propose a spatial-constrained loss function as an auxiliary task, to ensure consistency between the predicted and actual spatial distributions of travelers’ destinations. Moreover, CSLSL adopts modules named Long and Short-term Capturer (LSC) to learn the transition regularities across different time spans. Extensive experiments on three real-world datasets show promising performance improvements of CSLSL over baselines and confirm the effectiveness of introducing the causality and consistency constraints. The implementation is available at https://github.com/urbanmobility/CSLSL.
Title: Human mobility prediction with causal and spatial-constrained multi-task network. Authors: Zongyuan Huang, Shengyuan Xu, Menghan Wang, Hansi Wu, Yanyan Xu, Yaohui Jin. DOI: 10.1140/epjds/s13688-024-00460-7. Pub Date: 2024-03-19. Journal: EPJ Data Science.
Pub Date: 2024-03-12 DOI: 10.1140/epjds/s13688-024-00455-4
Abstract
This paper examines residential segregation in Berlin over time using a dynamic clustering analysis approach. Previous research has examined residential segregation in Berlin at a high level of spatial and temporal aggregation and statically, i.e. not over time. We propose a methodology to investigate the existence of clusters of residential areas according to migration background, age group, gender, and socio-economic dimension over time. To this end, we have developed a sequential mixed methods approach that includes a multivariate kernel density estimation technique to estimate the density of subpopulations and a dynamic cluster analysis to discover spatial patterns of residential segregation over time (2009-2020). The dynamic analysis shows the emergence of clusters on the dimensions of migration background, age group, gender and socio-economic variables. We also identified a structural change in 2015, resulting in a new cluster in Berlin that reflects the changing distribution of subpopulations with a particular migratory background. Finally, we discuss the findings of this study in relation to previous research and suggest possibilities for policy applications and future research using a dynamic clustering approach for analyzing changes in residential segregation at the city level.
Title: Evolving demographics: a dynamic clustering approach to analyze residential segregation in Berlin. Journal: EPJ Data Science.
Pub Date: 2024-03-08 DOI: 10.1140/epjds/s13688-024-00452-7
Abstract
The same individuals can express very different emotions in online social media than in face-to-face interactions, partially because of intrinsic limitations of the digital environments and partially because of their algorithmic design, which is optimized to maximize engagement. Such differences become even more pronounced for topics concerning socially sensitive and polarizing issues, such as massive pharmaceutical interventions. Here, we investigate how online emotional responses change during the large-scale COVID-19 vaccination campaign with respect to a baseline in which no specific contentious topic dominates. We show that the online discussions during the pandemic generate a vast spectrum of emotional responses compared to the baseline, especially when we take into account the characteristics of the users and the type of information shared on the online platform. Furthermore, we analyze the role of the political orientation of shared news, whose circulation seems to be driven not only by their actual informational content but also by the social need to strengthen one’s affiliation to, and positioning within, a specific online community by means of emotionally arousing posts. Our findings stress the importance of better understanding the emotional reactions to contentious topics at scale from digital signatures, while providing a more quantitative assessment of the ongoing online social dynamics to build a faithful picture of offline social implications.
Title: Large-scale digital signatures of emotional response to the COVID-19 vaccination campaign. Journal: EPJ Data Science.
Pub Date: 2024-03-07 DOI: 10.1140/epjds/s13688-024-00456-3
Giulio Corsi
Artificial intelligence (AI)-powered recommender systems play a crucial role in determining the content that users are exposed to on social media platforms. However, the behavioural patterns of these systems are often opaque, complicating the evaluation of their impact on the dissemination and consumption of disinformation and misinformation. To begin addressing this evidence gap, this study presents a measurement approach that uses observed digital traces to infer the status of algorithmic amplification of low-credibility content on Twitter over a 14-day period in January 2023. Using an original dataset of ≈ 2.7 million posts on COVID-19 and climate change published on the platform, this study identifies tweets sharing information from low-credibility domains, and uses a bootstrapping model with two stratifications, a tweet’s engagement level and a user’s followers level, to compare any differences in impressions generated between low-credibility and high-credibility samples. Additional stratification variables of toxicity, political bias, and verified status are also examined. This analysis provides valuable observational evidence on whether the Twitter algorithm favours the visibility of low-credibility content, with results indicating that, in aggregate, tweets containing low-credibility URL domains perform better than tweets that do not across both datasets. However, this effect is largely attributable to a difference in high-engagement, high-followers tweets, which are very impactful in terms of impressions generation, and are more likely to receive amplified visibility when containing low-credibility content. Furthermore, high-toxicity tweets and those with right-leaning bias see heightened amplification, as do low-credibility tweets from verified accounts. Ultimately, this suggests that Twitter’s recommender system may have facilitated the diffusion of false content by amplifying the visibility of high-engagement, low-credibility content generated by very influential users.
Title: Evaluating Twitter’s algorithmic amplification of low-credibility content: an observational study. Journal: EPJ Data Science.
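The idea of comparing impressions between low-credibility and high-credibility tweets within strata can be sketched as a simple stratified bootstrap. This is only an illustration under stated assumptions: the abstract does not specify the bootstrapping model, so the percentile interval, the function names `bootstrap_mean_diff` and `stratified_intervals`, the stratum labels, and all numbers below are hypothetical.

```python
import random
from statistics import mean

def bootstrap_mean_diff(sample_a, sample_b, n_boot=1000, rng=None):
    """Percentile 95% interval for mean(sample_a) - mean(sample_b)."""
    if rng is None:
        rng = random.Random(0)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resampled_a = rng.choices(sample_a, k=len(sample_a))
        resampled_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(mean(resampled_a) - mean(resampled_b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

def stratified_intervals(low_cred, high_cred, n_boot=1000, seed=0):
    """Run the bootstrap separately inside every stratum present in both inputs."""
    rng = random.Random(seed)
    return {
        stratum: bootstrap_mean_diff(low_cred[stratum], high_cred[stratum], n_boot, rng)
        for stratum in low_cred
        if stratum in high_cred
    }

# Hypothetical impressions per tweet, keyed by (engagement level, followers level).
low_cred = {
    ("high_engagement", "high_followers"): [100, 120, 130, 110],
    ("low_engagement", "low_followers"): [5, 6, 7, 5],
}
high_cred = {
    ("high_engagement", "high_followers"): [10, 12, 9, 11],
    ("low_engagement", "low_followers"): [5, 6, 7, 5],
}
intervals = stratified_intervals(low_cred, high_cred)
```

Resampling inside each stratum keeps the comparison between tweets of similar engagement and audience size, which is the purpose of stratifying; an interval lying entirely above zero in a stratum would indicate more impressions for the low-credibility sample there.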
Pub Date: 2024-03-07 DOI: 10.1140/epjds/s13688-024-00454-5
Aleksandra Urman, Ivan Smirnov, Jana Lasser
In this paper, we engage with and expand on the keynote talk about the “Right to Audit” given by Prof. Christian Sandvig at the International Conference on Computational Social Science 2021 through a critical reflection on power asymmetries in the algorithm auditing field. We elaborate on the challenges and asymmetries mentioned by Sandvig, such as those related to legal issues and the disparity between early-career and senior researchers. We also contribute a discussion of the asymmetries that were not covered by Sandvig but that we find critically important: those related to other disparities between researchers, incentive structures related to access to data from companies, targets of auditing, and users and their rights. We also discuss the implications these asymmetries have for algorithm auditing research, such as Western-centrism and a lack of diversity of perspectives. While we focus on the field of algorithm auditing specifically, we suggest that some of the discussed asymmetries affect Computational Social Science more generally and need to be reflected on and addressed.
Title: The right to audit and power asymmetries in algorithm auditing. Journal: EPJ Data Science.
Pub Date: 2024-03-07 DOI: 10.1140/epjds/s13688-024-00458-1
Nicholas W. Landry, Jean-Gabriel Young, Nicole Eikmeier
Higher-order networks are widely used to describe complex systems in which interactions can involve more than two entities at once. In this paper, we focus on inclusion within higher-order networks, referring to situations where specific entities participate in an interaction, and subsets of those entities also interact with each other. Traditional modeling approaches to higher-order networks tend to either not consider inclusion at all (e.g., hypergraph models) or explicitly assume perfect and complete inclusion (e.g., simplicial complex models). To allow for a more nuanced assessment of inclusion in higher-order networks, we introduce the concept of “simpliciality” and several corresponding measures. Contrary to current modeling practice, we show that empirically observed systems rarely lie at either end of the simpliciality spectrum. In addition, we show that generative models fitted to these datasets struggle to capture their inclusion structure. These findings suggest new modeling directions for the field of higher-order network science.
Title: The simpliciality of higher-order networks. Journal: EPJ Data Science.
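The notion of inclusion described in the abstract above (subsets of an interacting group also interacting with each other) can be made concrete with a small sketch. The measure below, labelled `simplicial_fraction` here, counts the share of hyperedges larger than a minimum size whose sub-groups are all present as hyperedges too. It is one plausible way to quantify inclusion under the stated assumptions; the abstract does not define the paper's measures, so this should not be read as their exact definition.

```python
from itertools import combinations

def simplicial_fraction(hyperedges, min_size=2):
    """Share of hyperedges with more than `min_size` nodes that are fully included.

    A hyperedge counts as fully included when every subset of it with at
    least `min_size` nodes (and fewer nodes than the hyperedge itself) is
    also a hyperedge. Returns None when no hyperedge is large enough.
    """
    edges = {frozenset(e) for e in hyperedges}
    candidates = [e for e in edges if len(e) > min_size]
    if not candidates:
        return None

    def fully_included(edge):
        return all(
            frozenset(sub) in edges
            for k in range(min_size, len(edge))
            for sub in combinations(edge, k)
        )

    return sum(fully_included(e) for e in candidates) / len(candidates)

# Hypothetical toy hypergraph: {1,2,3} has all its pairs, {3,4,5} does not.
toy = [{1, 2, 3}, {1, 2}, {1, 3}, {2, 3}, {3, 4, 5}, {3, 4}]
fraction = simplicial_fraction(toy)
```

A value of 1 would correspond to the simplicial-complex end of the spectrum and a value of 0 to a hypergraph with no complete inclusion; the toy example sits in between.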
Pub Date: 2024-03-05 DOI: 10.1140/epjds/s13688-024-00457-2
Shijia Song, Handong Li
This study introduces a comprehensive framework grounded in recurrence analysis, a tool of nonlinear dynamics, to detect potential early warning signals (EWS) for imminent phase transitions in financial systems, with the primary goal of anticipating severe financial crashes. We first conduct a simulation experiment to demonstrate that the indicators based on multiplex recurrence networks (MRNs), namely the average mutual information and the average edge overlap, can indicate state transitions in complex systems. Subsequently, we consider the constituent stocks of the Chinese and U.S. stock markets as empirical subjects, and establish MRNs based on multidimensional returns to monitor the nonlinear dynamics of the market through the corresponding indicators and topological structures. Empirical findings indicate that the primary indicators of MRNs offer valuable insights into significant financial events or periods of extreme instability. Notably, average mutual information demonstrates promise as an effective EWS for forecasting forthcoming financial crashes. We provide an in-depth discussion of the theoretical underpinnings for employing indicators of MRNs as EWS, the differences in indicator effectiveness, and the possible reasons for variations in the performance of the EWS across the two markets. This paper contributes to the ongoing discourse on early warning of extreme market volatility, emphasizing the applicability of recurrence analysis in predicting financial crashes.
{"title":"Early warning signals for stock market crashes: empirical and analytical insights utilizing nonlinear methods","authors":"Shijia Song, Handong Li","doi":"10.1140/epjds/s13688-024-00457-2","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00457-2","journal":{"name":"EPJ Data Science"},"PeriodicalIF":3.6,"publicationDate":"2024-03-05","publicationTypes":"Journal Article"}
Pub Date : 2024-02-28 DOI: 10.1140/epjds/s13688-024-00453-6
Wenlong Yang, Yang Wang
The prevalence of teamwork in contemporary science has raised new questions about collaboration networks and their potential impact on research outcomes. Previous studies primarily focused on pairwise interactions between scientists when constructing collaboration networks, potentially overlooking group interactions among scientists. In this study, we introduce a higher-order network representation from algebraic topology, namely simplicial complexes, to capture multi-agent interactions. Our main objective is to investigate the influence of higher-order structures in local collaboration networks on the productivity of the focal scientist. Leveraging a dataset comprising more than 3.7 million scientists from the Microsoft Academic Graph, we uncover several findings. First, we observe an inverted U-shaped relationship between the number of disconnected components in the local collaboration network and scientific productivity. Second, there is a positive association between the presence of higher-order loops and individual scientific productivity, indicating a role for higher-order structures in advancing science. Third, these effects hold across various scientific domains and for scientists with different levels of impact, suggesting strong generalizability of our findings. The findings highlight the role of higher-order loops in shaping the development of individual scientists, and thus may have implications for nurturing scientific talent and promoting innovative breakthroughs.
{"title":"Higher-order structures of local collaboration networks are associated with individual scientific productivity","authors":"Wenlong Yang, Yang Wang","doi":"10.1140/epjds/s13688-024-00453-6","DOIUrl":"https://doi.org/10.1140/epjds/s13688-024-00453-6","journal":{"name":"EPJ Data Science"},"PeriodicalIF":3.6,"publicationDate":"2024-02-28","publicationTypes":"Journal Article"}