
EPJ Data Science: latest articles

Critical computational social science
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-02-26 | DOI: 10.1140/epjds/s13688-023-00433-2
Sarah Shugars

In her 2021 IC2S2 keynote talk, “Critical Data Theory,” Margaret Hu builds off Critical Race Theory, privacy law, and big data surveillance to grapple with questions at the intersection of big data and legal jurisprudence. As a legal scholar, Hu’s work focuses primarily on issues of governance and regulation—examining the legal and constitutional impact of modern data collection and analysis. Yet, her call for Critical Data Theory has important implications for the field of Computational Social Science (CSS) as a whole. In this article, I therefore reflect on Hu’s conception of Critical Data Theory and its broader implications for CSS research. Specifically, I’ll consider the ramifications of her work for the scientific community—exploring how we as researchers should think about the ethics and realities of the data which forms the foundations of our work.

Citations: 0
Thinking spatially in computational social science
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-02-26 | DOI: 10.1140/epjds/s13688-023-00443-0

Abstract

Deductive and theory-driven research starts by asking questions. Finding tentative answers to these questions in the literature is next. It is followed by gathering, preparing and modelling relevant data to empirically test these tentative answers. Inductive research, on the other hand, starts with data representation and finding general patterns in data. Ahn suggested, in his keynote speech at the seventh International Conference on Computational Social Science (IC2S2) 2021, that the way this data is represented could shape our understanding and the type of answers we find for the questions. He discussed that specific representation learning approaches enable a meaningful embedding space and could allow spatial thinking and broaden computational imagination. In this commentary, I summarize Ahn’s keynote and related publications, provide an overview of the use of spatial metaphor in sociology, discuss how such representation learning can help both inductive and deductive research, propose future avenues of research that could benefit from spatial thinking, and pose some still open questions.

Citations: 0
Charting mobility patterns in the scientific knowledge landscape
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-02-20 | DOI: 10.1140/epjds/s13688-024-00451-8

Abstract

From small steps to great leaps, metaphors of spatial mobility abound to describe discovery processes. Here, we ground these ideas in formal terms by systematically studying mobility patterns in the scientific knowledge landscape. We use low-dimensional embedding techniques to create a knowledge space made up of 1.5 million articles from the fields of physics, computer science, and mathematics. By analyzing the publication histories of individual researchers, we discover patterns of scientific mobility that closely resemble physical mobility. In aggregate, the trajectories form mobility flows that can be described by a gravity model, with jumps more likely to occur in areas of high density and less likely to occur over longer distances. We identify two types of researchers from their individual mobility patterns: interdisciplinary explorers who pioneer new fields, and exploiters who are more likely to stay within their specific areas of expertise. Our results suggest that spatial mobility analysis is a valuable tool for understanding the evolution of science.
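
As a rough illustration of the gravity model mentioned in this abstract: the abstract does not give the functional form, so the sketch below assumes the generic textbook version, in which the expected flow between two regions grows with the product of their densities and decays as a power of their distance. The function name, the exponent `gamma`, and the constant `k` are all hypothetical choices, not the authors' fitted model.

```python
def gravity_flow(density_i, density_j, distance, gamma=2.0, k=1.0):
    """Expected flow between two regions under a generic gravity model.

    T_ij = k * density_i * density_j / distance ** gamma
    The exponent and constant are illustrative assumptions.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * density_i * density_j / distance ** gamma


# Same pair of regions, first close together, then twice as far apart.
near = gravity_flow(100, 50, distance=1.0)
far = gravity_flow(100, 50, distance=2.0)
# Doubling the density of one region at the larger distance.
dense = gravity_flow(200, 50, distance=2.0)
```

Under these assumptions a jump between dense regions is more likely than one between sparse regions, and any jump becomes less likely as the distance in the embedding space grows, which is the qualitative pattern the abstract reports.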

Citations: 0
Cryptocurrency co-investment network: token returns reflect investment patterns
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-02-20 | DOI: 10.1140/epjds/s13688-023-00446-x
Luca Mungo, Silvia Bartolucci, Laura Alessandretti

Since the introduction of Bitcoin in 2009, the dramatic and unsteady evolution of the cryptocurrency market has also been driven by large investments by traditional and cryptocurrency-focused hedge funds. Notwithstanding their critical role, our understanding of the relationship between institutional investments and the evolution of the cryptocurrency market has remained limited, also due to the lack of comprehensive data describing investments over time. In this study, we present a quantitative study of cryptocurrency institutional investments based on a dataset collected for 1324 currencies in the period between 2014 and 2022 from Crunchbase, one of the largest platforms gathering business information. We show that the evolution of the cryptocurrency market capitalization is highly correlated with the size of institutional investments, thus confirming their important role. Further, we find that the market is dominated by the presence of a group of prominent investors who tend to specialise by focusing on particular technologies. Finally, studying the co-investment network of currencies that share common investors, we show that assets with shared investors tend to be characterized by similar market behaviour. Our work sheds light on the role played by institutional investors and provides a basis for further research on their influence in the cryptocurrency ecosystem.
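
The co-investment network described in this abstract links currencies that share common investors. A minimal sketch of that construction, assuming the input is a mapping from each investor to the set of tokens it holds, might look like the following. The function name, the data layout, and the sample funds are hypothetical; the abstract does not say how the authors weight or filter edges.

```python
from collections import Counter
from itertools import combinations


def co_investment_edges(portfolios):
    """Project an investor -> tokens mapping onto token pairs.

    Returns a Counter mapping (token_a, token_b) to the number of
    investors holding both tokens. Pairs are stored in sorted order.
    """
    edges = Counter()
    for tokens in portfolios.values():
        for a, b in combinations(sorted(tokens), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy data, not taken from the Crunchbase dataset.
portfolios = {
    "fund_x": {"BTC", "ETH"},
    "fund_y": {"ETH", "SOL", "BTC"},
    "fund_z": {"SOL"},
}
edges = co_investment_edges(portfolios)
```

Each edge weight counts shared investors, so a heavier edge marks a pair of assets that, according to the abstract, would be expected to show more similar market behaviour.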

Citations: 0
Identifying the systemic importance and systemic vulnerability of financial institutions based on portfolio similarity correlation network
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-01-31 | DOI: 10.1140/epjds/s13688-024-00449-2
Manjin Shao, Hong Fan

The indirect correlation among financial institutions, stemming from similarities in their portfolios, is a primary driver of systemic risk. However, most existing research overlooks the influence of portfolio similarity among various types of financial institutions on this risk. Therefore, we construct the network of portfolio similarity correlations among different types of financial institutions, based on measurements of portfolio similarity. Utilizing the expanded fire sale contagion model, we offer a comprehensive assessment of systemic risk for Chinese financial institutions. Initially, we introduce indicators for systemic risk, systemic importance, and systemic vulnerability. Subsequently, we examine the cross-sectional and time-series characteristics of these institutions’ systemic importance and vulnerability within the context of the portfolio similarity correlation network. Our empirical findings reveal a high degree of portfolio similarity between banks and insurance companies, contrasted with lower similarity between banks and securities firms. Moreover, when considering the portfolio similarity correlation network, both the systemic importance and vulnerability of Chinese banks and insurance companies surpass those of securities firms in both cross-sectional and temporal dimensions. Notably, our analysis further illustrates that a financial institution’s systemic importance and vulnerability are strongly and positively associated with the magnitude of portfolio similarity between that institution and others.

Citations: 0
Account credibility inference based on news-sharing networks
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-01-31 | DOI: 10.1140/epjds/s13688-024-00450-9
Bao Tran Truong, Oliver Melbourne Allen, Filippo Menczer

The spread of misinformation poses a threat to the social media ecosystem. Effective countermeasures to mitigate this threat require that social media platforms be able to accurately detect low-credibility accounts even before the content they share can be classified as misinformation. Here we present methods to infer account credibility from information diffusion patterns, in particular leveraging two networks: the reshare network, capturing an account’s trust in other accounts, and the bipartite account-source network, capturing an account’s trust in media sources. We extend network centrality measures and graph embedding techniques, systematically comparing these algorithms on data from diverse contexts and social media platforms. We demonstrate that both kinds of trust networks provide useful signals for estimating account credibility. Some of the proposed methods yield high accuracy, providing promising solutions to promote the dissemination of reliable information in online communities. Two kinds of homophily emerge from our results: accounts tend to have similar credibility if they reshare each other’s content or share content from similar sources. Our methodology invites further investigation into the relationship between accounts and news sources to better characterize misinformation spreaders.

Citations: 0
Which sport is becoming more predictable? A cross-discipline analysis of predictability in team sports
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-01-29 | DOI: 10.1140/epjds/s13688-024-00448-3
Michele Coscia

Professional sports are a cultural activity beloved by many, and a global hundred-billion-dollar industry. In this paper, we investigate the trends of match outcome predictability, assuming that the public is more interested in an event if there is some uncertainty about who will win. We reproduce previous methodology focused on soccer and we expand it by analyzing more than 300,000 matches in the 1996-2023 period from nine disciplines, to identify which disciplines are getting more/less predictable over time. We investigate the home advantage effect, since it can affect outcome predictability and it has been impacted by the COVID-19 pandemic. Going beyond previous work, we estimate which sport management model – between the egalitarian one popular in North America and the rich-get-richer used in Europe – leads to more uncertain outcomes. Our results show that there is no generalized trend in predictability across sport disciplines, that home advantage has been decreasing independently from the pandemic, and that sports managed with the egalitarian North American approach tend to be less predictable. We base our result on a predictive model that ranks team by analyzing the directed network of who-beats-whom, where the most central teams in the network are expected to be the best performing ones. Our results are robust to the measure we use for the prediction.
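
To make the who-beats-whom idea in this abstract concrete, here is a minimal sketch that ranks teams by centrality in a directed network where each loser points to the team that beat it. PageRank is used purely as an assumed stand-in: the abstract does not name the centrality measure and notes only that the results are robust to the choice. The function name, damping factor, and sample matches are hypothetical.

```python
def rank_teams(matches, damping=0.85, iterations=100):
    """Score teams with PageRank on a loser -> winner network.

    matches: list of (winner, loser) tuples.
    Returns a dict mapping each team to its score (scores sum to 1).
    """
    teams = sorted({team for match in matches for team in match})
    n = len(teams)
    # Each loss becomes an edge from the loser to the winner.
    out_edges = {team: [] for team in teams}
    for winner, loser in matches:
        out_edges[loser].append(winner)

    score = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        new = {team: (1.0 - damping) / n for team in teams}
        for team in teams:
            targets = out_edges[team]
            if targets:
                share = damping * score[team] / len(targets)
                for winner in targets:
                    new[winner] += share
            else:
                # Undefeated team: spread its score evenly over all teams.
                share = damping * score[team] / n
                for other in teams:
                    new[other] += share
        score = new
    return score


# Hypothetical mini-season: A beats B and C, B beats C.
matches = [("A", "B"), ("A", "C"), ("B", "C")]
scores = rank_teams(matches)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy season the most central team is the one that beat everyone else. A predictability study could then compare such a ranking, computed from past matches, against later results, though the abstract does not describe the exact evaluation used.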

Citations: 0
Modeling teams performance using deep representational learning on graphs
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-01-19 | DOI: 10.1140/epjds/s13688-023-00442-1
Francesco Carli, Pietro Foini, Nicolò Gozzi, Nicola Perra, Rossano Schifanella

Most human activities require collaborations within and across formal or informal teams. Our understanding of how the collaborative efforts spent by teams relate to their performance is still a matter of debate. Teamwork results in a highly interconnected ecosystem of potentially overlapping components where tasks are performed in interaction with team members and across other teams. To tackle this problem, we propose a graph neural network model to predict a team’s performance while identifying the drivers determining such outcome. In particular, the model is based on three architectural channels: topological, centrality, and contextual, which capture different factors potentially shaping teams’ success. We endow the model with two attention mechanisms to boost model performance and allow interpretability. A first mechanism allows pinpointing key members inside the team. A second mechanism allows us to quantify the contributions of the three driver effects in determining the outcome performance. We test model performance on various domains, outperforming most classical and neural baselines. Moreover, we include synthetic datasets designed to validate how the model disentangles the intended properties on which our model vastly outperforms baselines.

Citations: 0
Comparison of home detection algorithms using smartphone GPS data
IF 3.6 | Zone 2 (Computer Science) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2024-01-16 | DOI: 10.1140/epjds/s13688-023-00447-w
Rajat Verma, Shagun Mittal, Zengxiang Lei, Xiaowei Chen, Satish V. Ukkusuri

Estimation of people’s home locations using location-based services data from smartphones is a common task in human mobility assessment. However, commonly used home detection algorithms (HDAs) are often arbitrary and unexamined. In this study, we review existing HDAs and examine five HDAs using eight high-quality mobile phone geolocation datasets. These include four commonly used HDAs as well as an HDA proposed in this work. To make quantitative comparisons, we propose three novel metrics to assess the quality of detected home locations and test them on eight datasets across four U.S. cities. We find that all three metrics show a consistent rank of HDAs’ performances, with the proposed HDA outperforming the others. We infer that the temporal and spatial continuity of the geolocation data points matters more than the overall size of the data for accurate home detection. We also find that HDAs with high (and similar) performance metrics tend to create results with better consistency and closer to common expectations. Further, the performance deteriorates with decreasing data quality of the devices, though the patterns of relative performance persist. Finally, we show how the differences in home detection can lead to substantial differences in subsequent inferences using two case studies—(i) hurricane evacuation estimation, and (ii) correlation of mobility patterns with socioeconomic status. Our work contributes to improving the transparency of large-scale human mobility assessment applications.

Citations: 0
What relational event models can reveal: Commentary on Thomas Grund’s “Dynamics of Denunciation: The Limits of a Scandal”
IF 3.6 Zone 2 (Computer Science) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2024-01-11 DOI: 10.1140/epjds/s13688-023-00432-3

Abstract

This article provides a commentary on Thomas Grund’s International Conference on Computational Social Science 2021 keynote “Dynamics of Denunciation: The Limits of a Scandal”. The keynote presents results from research investigating the relational dynamics underpinning the denunciations provided in testimonies relating to a Canadian political scandal. Grund uses relational event models to test hypotheses about the social mechanisms driving the denunciations. Although denunciation should depend only on who is guilty and not on who has said what up to that point, Grund’s study finds evidence in support of a number of relational mechanisms influencing the denunciation process. Grund argues that the apparent influence of past denunciations on testimonies reveals the limits of the inquiry process itself and what it can reveal about a scandal. This article reviews Grund’s talk and puts the work in a broader context of using approaches rooted in event history modelling and social network theory to illuminate the processes defining social interaction data. It highlights ways in which the keynote can inform the development of computational social science approaches to analysing such data, and argues that the value of such an analysis has implications for scholarship beyond the social sciences.

Citations: 0