
EPJ Data Science: Latest Articles

Evolving demographics: a dynamic clustering approach to analyze residential segregation in Berlin
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-03-12 DOI: 10.1140/epjds/s13688-024-00455-4

Abstract

This paper examines the phenomenon of residential segregation in Berlin over time using a dynamic clustering analysis approach. Previous research has examined the phenomenon of residential segregation in Berlin at a high spatial and temporal aggregation and statically, i.e. not over time. We propose a methodology to investigate the existence of clusters of residential areas according to migration background, age group, gender, and socio-economic dimension over time. To this end, we have developed a sequential mixed methods approach that includes a multivariate kernel density estimation technique to estimate the density of subpopulations and a dynamic cluster analysis to discover spatial patterns of residential segregation over time (2009-2020). The dynamic analysis shows the emergence of clusters on the dimensions of migration background, age group, gender and socio-economic variables. We also identified a structural change in 2015, resulting in a new cluster in Berlin that reflects the changing distribution of subpopulations with a particular migratory background. Finally, we discuss the findings of this study with previous research and suggest possibilities for policy applications and future research using a dynamic clustering approach for analyzing changes in residential segregation at the city level.
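The density-estimation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes an isotropic Gaussian kernel with a fixed bandwidth; the abstract does not state the estimator or bandwidth the authors actually used, and the function name `gaussian_kde_2d` and the toy coordinates are hypothetical.

```python
import math

def gaussian_kde_2d(points, bandwidth):
    """Return a density function estimated from 2-D points with a Gaussian kernel."""
    n = len(points)
    # Normalising constant of an isotropic 2-D Gaussian, averaged over n points.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density

# Hypothetical residence coordinates (km) of one subpopulation in one year.
residents = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1), (5.0, 5.0)]
density = gaussian_kde_2d(residents, bandwidth=0.5)
near_cluster = density(0.1, 0.0)   # evaluated inside the dense group
far_away = density(2.5, 2.5)       # evaluated between the groups
```

Evaluating such a density on a grid for every subpopulation and year would give the per-area feature vectors that a clustering step could then group over time.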

Citations: 0
Large-scale digital signatures of emotional response to the COVID-19 vaccination campaign
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-03-08 DOI: 10.1140/epjds/s13688-024-00452-7

Abstract

The same individuals can express very different emotions in online social media with respect to face-to-face interactions, partially because of intrinsic limitations of the digital environments and partially because of their algorithmic design, which is optimized to maximize engagement. Such differences become even more pronounced for topics concerning socially sensitive and polarizing issues, such as massive pharmaceutical interventions. Here, we investigate how online emotional responses change during the large-scale COVID-19 vaccination campaign with respect to a baseline in which no specific contentious topic dominates. We show that the online discussions during the pandemic generate a vast spectrum of emotional response compared to the baseline, especially when we take into account the characteristics of the users and the type of information shared in the online platform. Furthermore, we analyze the role of the political orientation of shared news, whose circulation seems to be driven not only by their actual informational content but also by the social need to strengthen one’s affiliation to, and positioning within, a specific online community by means of emotionally arousing posts. Our findings stress the importance of better understanding the emotional reactions to contentious topics at scale from digital signatures, while providing a more quantitative assessment of the ongoing online social dynamics to build a faithful picture of offline social implications.

Citations: 0
Evaluating Twitter’s algorithmic amplification of low-credibility content: an observational study
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-03-07 DOI: 10.1140/epjds/s13688-024-00456-3
Giulio Corsi

Artificial intelligence (AI)-powered recommender systems play a crucial role in determining the content that users are exposed to on social media platforms. However, the behavioural patterns of these systems are often opaque, complicating the evaluation of their impact on the dissemination and consumption of disinformation and misinformation. To begin addressing this evidence gap, this study presents a measurement approach that uses observed digital traces to infer the status of algorithmic amplification of low-credibility content on Twitter over a 14-day period in January 2023. Using an original dataset of ≈ 2.7 million posts on COVID-19 and climate change published on the platform, this study identifies tweets sharing information from low-credibility domains, and uses a bootstrapping model with two stratifications, a tweet’s engagement level and a user’s followers level, to compare any differences in impressions generated between low-credibility and high-credibility samples. Additional stratification variables of toxicity, political bias, and verified status are also examined. This analysis provides valuable observational evidence on whether the Twitter algorithm favours the visibility of low-credibility content, with results indicating that, on aggregate, tweets containing low-credibility URL domains perform better than tweets that do not across both datasets. However, this effect is largely attributable to a difference in high-engagement, high-followers tweets, which are very impactful in terms of impressions generation, and are more likely to receive amplified visibility when containing low-credibility content. Furthermore, high toxicity tweets and those with right-leaning bias see heightened amplification, as do low-credibility tweets from verified accounts. Ultimately, this suggests that Twitter’s recommender system may have facilitated the diffusion of false content by amplifying the visibility of low-credibility content with high-engagement generated by very influential users.
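The stratified bootstrap comparison described in this abstract could look roughly like the sketch below. It is an illustration under assumptions, not the author's procedure: the dictionary keys (`engagement`, `followers`, `low_cred`, `impressions`), the percentile interval, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_bootstrap(tweets, n_boot=1000, seed=0):
    """Bootstrap the low-minus-high credibility gap in mean impressions per stratum.

    Each tweet is a dict with hypothetical keys: 'engagement', 'followers',
    'low_cred' (bool) and 'impressions' (number).
    """
    rng = random.Random(seed)
    strata = defaultdict(lambda: {True: [], False: []})
    for t in tweets:
        key = (t["engagement"], t["followers"])
        strata[key][t["low_cred"]].append(t["impressions"])

    results = {}
    for key, groups in strata.items():
        low, high = groups[True], groups[False]
        if not low or not high:
            continue  # a stratum needs both groups to be compared
        diffs = []
        for _ in range(n_boot):
            # Resample each group with replacement, keeping its original size.
            low_s = rng.choices(low, k=len(low))
            high_s = rng.choices(high, k=len(high))
            diffs.append(sum(low_s) / len(low_s) - sum(high_s) / len(high_s))
        diffs.sort()
        # Approximate 95% percentile interval for the difference in means.
        results[key] = (diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1])
    return results

# Toy data for a single (high engagement, high followers) stratum.
toy = (
    [{"engagement": "high", "followers": "high", "low_cred": True, "impressions": v}
     for v in (900, 1000, 1100, 1200)]
    + [{"engagement": "high", "followers": "high", "low_cred": False, "impressions": v}
       for v in (100, 150, 200, 250)]
)
intervals = stratified_bootstrap(toy)
```

An interval that lies entirely above zero for a stratum would indicate that low-credibility tweets in that stratum drew more impressions on average than comparable high-credibility tweets.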

Citations: 0
The right to audit and power asymmetries in algorithm auditing
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-03-07 DOI: 10.1140/epjds/s13688-024-00454-5
Aleksandra Urman, Ivan Smirnov, Jana Lasser

In this paper, we engage with and expand on the keynote talk about the “Right to Audit” given by Prof. Christian Sandvig at the International Conference on Computational Social Science 2021 through a critical reflection on power asymmetries in the algorithm auditing field. We elaborate on the challenges and asymmetries mentioned by Sandvig — such as those related to legal issues and the disparity between early-career and senior researchers. We also contribute a discussion of the asymmetries that were not covered by Sandvig but that we find critically important: those related to other disparities between researchers, incentive structures related to the access to data from companies, targets of auditing and users and their rights. We also discuss the implications these asymmetries have for algorithm auditing research such as the Western-centrism and the lack of the diversity of perspectives. While we focus on the field of algorithm auditing specifically, we suggest some of the discussed asymmetries affect Computational Social Science more generally and need to be reflected on and addressed.

Citations: 0
The simpliciality of higher-order networks
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-03-07 DOI: 10.1140/epjds/s13688-024-00458-1
Nicholas W. Landry, Jean-Gabriel Young, Nicole Eikmeier

Higher-order networks are widely used to describe complex systems in which interactions can involve more than two entities at once. In this paper, we focus on inclusion within higher-order networks, referring to situations where specific entities participate in an interaction, and subsets of those entities also interact with each other. Traditional modeling approaches to higher-order networks tend to either not consider inclusion at all (e.g., hypergraph models) or explicitly assume perfect and complete inclusion (e.g., simplicial complex models). To allow for a more nuanced assessment of inclusion in higher-order networks, we introduce the concept of “simpliciality” and several corresponding measures. Contrary to current modeling practice, we show that empirically observed systems rarely lie at either end of the simpliciality spectrum. In addition, we show that generative models fitted to these datasets struggle to capture their inclusion structure. These findings suggest new modeling directions for the field of higher-order network science.
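To make the notion of inclusion concrete, the sketch below computes one plausible inclusion measure: the fraction of hyperedges with three or more nodes whose sub-interactions are all present as hyperedges. The abstract does not define its measures, so this definition, the function name `simplicial_fraction`, and the `min_size` parameter are assumptions for illustration.

```python
from itertools import combinations

def simplicial_fraction(hyperedges, min_size=2):
    """Fraction of hyperedges with >= 3 nodes whose sub-edges of size >= min_size all exist."""
    edges = {frozenset(e) for e in hyperedges}
    candidates = [e for e in edges if len(e) >= 3]
    if not candidates:
        return None  # undefined when there is no higher-order interaction

    def is_simplex(edge):
        # Check every proper subset of the edge with at least min_size nodes.
        return all(
            frozenset(sub) in edges
            for k in range(min_size, len(edge))
            for sub in combinations(edge, k)
        )

    return sum(is_simplex(e) for e in candidates) / len(candidates)

# {1,2,3} includes all of its pairs; {3,4,5} is missing (3,5) and (4,5).
toy_hypergraph = [(1, 2, 3), (1, 2), (1, 3), (2, 3), (3, 4, 5), (3, 4)]
fraction = simplicial_fraction(toy_hypergraph)
```

Under this definition a hypergraph with no inclusion scores 0 and a simplicial complex scores 1, which matches the two ends of the spectrum the abstract describes.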

Citations: 0
Early warning signals for stock market crashes: empirical and analytical insights utilizing nonlinear methods
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-03-05 DOI: 10.1140/epjds/s13688-024-00457-2
Shijia Song, Handong Li

This study introduces a comprehensive framework grounded in recurrence analysis, a tool of nonlinear dynamics, to detect potential early warning signals (EWS) for imminent phase transitions in financial systems, with the primary goal of anticipating severe financial crashes. We first conduct a simulation experiment to demonstrate that the indicators based on multiplex recurrence networks (MRNs), namely the average mutual information and the average edge overlap, can indicate state transitions in complex systems. Subsequently, we consider the constituent stocks of the Chinese and U.S. stock markets as empirical subjects, and establish MRNs based on multidimensional returns to monitor the nonlinear dynamics of the market through the corresponding indicators and topological structures. Empirical findings indicate that the primary indicators of MRNs offer valuable insights into significant financial events or periods of extreme instability. Notably, average mutual information demonstrates promise as an effective EWS for forecasting forthcoming financial crashes. An in-depth discussion and elucidation of the theoretical underpinnings for employing indicators of MRNs as EWS, the differences in indicator effectiveness, and the possible reasons for variations in the performance of the EWS across the two markets are provided. This paper contributes to the ongoing discourse on early warning of extreme market volatility, emphasizing the applicability of recurrence analysis in predicting financial crashes.
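One of the two indicators named in this abstract, the average edge overlap, can be sketched as follows. The simplifications are assumptions: each layer is built from a scalar series with a fixed distance threshold `eps`, whereas recurrence networks are usually built from embedded state vectors, and the abstract does not give the authors' construction details.

```python
def recurrence_network(series, eps):
    """Edges (i, j), i < j, linking time points whose values lie within eps of each other."""
    n = len(series)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(series[i] - series[j]) < eps
    }

def average_edge_overlap(layers):
    """Mean fraction of layers in which an edge appears, over edges present in any layer."""
    union = set().union(*layers)
    if not union:
        return 0.0
    total = sum(len(layer) for layer in layers)
    return total / (len(layers) * len(union))

# Toy series standing in for the returns of individual stocks (hypothetical values).
a = [0.0, 0.1, 1.0, 1.1]
c = [0.0, 1.0, 0.1, 1.1]
same = average_edge_overlap([recurrence_network(a, 0.2), recurrence_network(a, 0.2)])
different = average_edge_overlap([recurrence_network(a, 0.2), recurrence_network(c, 0.2)])
```

Layers with identical recurrence structure give an overlap of 1, while layers that share no edges give 1 divided by the number of layers, so the indicator rises as the series become more dynamically similar.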

Citations: 0
Higher-order structures of local collaboration networks are associated with individual scientific productivity
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-02-28 DOI: 10.1140/epjds/s13688-024-00453-6
Wenlong Yang, Yang Wang

The prevalence of teamwork in contemporary science has raised new questions about collaboration networks and the potential impact on research outcomes. Previous studies primarily focused on pairwise interactions between scientists when constructing collaboration networks, potentially overlooking group interactions among scientists. In this study, we introduce a higher-order network representation using algebraic topology to capture multi-agent interactions, i.e., simplicial complexes. Our main objective is to investigate the influence of higher-order structures in local collaboration networks on the productivity of the focal scientist. Leveraging a dataset comprising more than 3.7 million scientists from the Microsoft Academic Graph, we uncover several intriguing findings. Firstly, we observe an inverted U-shaped relationship between the number of disconnected components in the local collaboration network and scientific productivity. Secondly, there is a positive association between the presence of higher-order loops and individual scientific productivity, indicating the intriguing role of higher-order structures in advancing science. Thirdly, these effects hold across various scientific domains and scientists with different impacts, suggesting strong generalizability of our findings. The findings highlight the role of higher-order loops in shaping the development of individual scientists, thus may have implications for nurturing scientific talent and promoting innovative breakthroughs.
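The first finding refers to the number of disconnected components in a scientist's local collaboration network. A minimal sketch of one way to count them is given below; it assumes the local network consists of the focal scientist's coauthors, linked whenever they share a paper, with the focal scientist removed. The abstract does not specify this construction, and the function name `local_components` is hypothetical.

```python
def local_components(papers, focal):
    """Count connected components among the focal scientist's coauthors, focal node removed."""
    coauthors = set()
    for authors in papers:
        if focal in authors:
            coauthors.update(a for a in authors if a != focal)

    parent = {a: a for a in coauthors}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for authors in papers:
        # Coauthors appearing on the same paper belong to the same component.
        group = [a for a in authors if a in coauthors]
        for other in group[1:]:
            parent[find(other)] = find(group[0])

    return len({find(a) for a in coauthors})

# F wrote with {A, B} and separately with C; D is not a coauthor of F.
papers = [("F", "A", "B"), ("F", "C"), ("A", "D")]
n_components = local_components(papers, "F")
# A paper joining C and B (without F) merges the two groups.
n_merged = local_components(papers + [("C", "E", "B")], "F")
```

Each paper's author list is treated here as a group interaction, which is the kind of information a purely pairwise network representation would flatten.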

Citations: 0
Critical computational social science
IF 3.6 Zone 2 Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-02-26 DOI: 10.1140/epjds/s13688-023-00433-2
Sarah Shugars

In her 2021 IC2S2 keynote talk, “Critical Data Theory,” Margaret Hu builds off Critical Race Theory, privacy law, and big data surveillance to grapple with questions at the intersection of big data and legal jurisprudence. As a legal scholar, Hu’s work focuses primarily on issues of governance and regulation—examining the legal and constitutional impact of modern data collection and analysis. Yet, her call for Critical Data Theory has important implications for the field of Computational Social Science (CSS) as a whole. In this article, I therefore reflect on Hu’s conception of Critical Data Theory and its broader implications for CSS research. Specifically, I’ll consider the ramifications of her work for the scientific community—exploring how we as researchers should think about the ethics and realities of the data which forms the foundations of our work.

Citations: 0
Thinking spatially in computational social science
IF 3.6 Zone 2, Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-02-26 DOI: 10.1140/epjds/s13688-023-00443-0

Abstract

Deductive and theory-driven research starts by asking questions. Finding tentative answers to these questions in the literature is next. It is followed by gathering, preparing and modelling relevant data to empirically test these tentative answers. Inductive research, on the other hand, starts with data representation and finding general patterns in data. Ahn suggested, in his keynote speech at the seventh International Conference on Computational Social Science (IC2S2) 2021, that the way this data is represented could shape our understanding and the type of answers we find for the questions. He discussed that specific representation learning approaches enable a meaningful embedding space and could allow spatial thinking and broaden computational imagination. In this commentary, I summarize Ahn’s keynote and related publications, provide an overview of the use of spatial metaphor in sociology, discuss how such representation learning can help both inductive and deductive research, propose future avenues of research that could benefit from spatial thinking, and pose some still open questions.

Citations: 0
Charting mobility patterns in the scientific knowledge landscape
IF 3.6 Zone 2, Computer Science Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-02-20 DOI: 10.1140/epjds/s13688-024-00451-8

Abstract

From small steps to great leaps, metaphors of spatial mobility abound to describe discovery processes. Here, we ground these ideas in formal terms by systematically studying mobility patterns in the scientific knowledge landscape. We use low-dimensional embedding techniques to create a knowledge space made up of 1.5 million articles from the fields of physics, computer science, and mathematics. By analyzing the publication histories of individual researchers, we discover patterns of scientific mobility that closely resemble physical mobility. In aggregate, the trajectories form mobility flows that can be described by a gravity model, with jumps more likely to occur in areas of high density and less likely to occur over longer distances. We identify two types of researchers from their individual mobility patterns: interdisciplinary explorers who pioneer new fields, and exploiters who are more likely to stay within their specific areas of expertise. Our results suggest that spatial mobility analysis is a valuable tool for understanding the evolution of science.

Citations: 0