Pub Date: 2024-02-26 | DOI: 10.1140/epjds/s13688-023-00433-2
Sarah Shugars
In her 2021 IC2S2 keynote talk, “Critical Data Theory,” Margaret Hu builds off Critical Race Theory, privacy law, and big data surveillance to grapple with questions at the intersection of big data and legal jurisprudence. As a legal scholar, Hu’s work focuses primarily on issues of governance and regulation—examining the legal and constitutional impact of modern data collection and analysis. Yet, her call for Critical Data Theory has important implications for the field of Computational Social Science (CSS) as a whole. In this article, I therefore reflect on Hu’s conception of Critical Data Theory and its broader implications for CSS research. Specifically, I’ll consider the ramifications of her work for the scientific community—exploring how we as researchers should think about the ethics and realities of the data which forms the foundations of our work.
Title: Critical computational social science (EPJ Data Science)
Pub Date: 2024-02-26 | DOI: 10.1140/epjds/s13688-023-00443-0
Abstract
Deductive and theory-driven research starts by asking questions. Finding tentative answers to these questions in the literature is next. It is followed by gathering, preparing and modelling relevant data to empirically test these tentative answers. Inductive research, on the other hand, starts with data representation and finding general patterns in data. Ahn suggested, in his keynote speech at the seventh International Conference on Computational Social Science (IC2S2) 2021, that the way this data is represented could shape our understanding and the type of answers we find for the questions. He discussed that specific representation learning approaches enable a meaningful embedding space and could allow spatial thinking and broaden computational imagination. In this commentary, I summarize Ahn’s keynote and related publications, provide an overview of the use of spatial metaphor in sociology, discuss how such representation learning can help both inductive and deductive research, propose future avenues of research that could benefit from spatial thinking, and pose some still open questions.
Title: Thinking spatially in computational social science (EPJ Data Science)
Pub Date: 2024-02-20 | DOI: 10.1140/epjds/s13688-024-00451-8
Abstract
From small steps to great leaps, metaphors of spatial mobility abound to describe discovery processes. Here, we ground these ideas in formal terms by systematically studying mobility patterns in the scientific knowledge landscape. We use low-dimensional embedding techniques to create a knowledge space made up of 1.5 million articles from the fields of physics, computer science, and mathematics. By analyzing the publication histories of individual researchers, we discover patterns of scientific mobility that closely resemble physical mobility. In aggregate, the trajectories form mobility flows that can be described by a gravity model, with jumps more likely to occur in areas of high density and less likely to occur over longer distances. We identify two types of researchers from their individual mobility patterns: interdisciplinary explorers who pioneer new fields, and exploiters who are more likely to stay within their specific areas of expertise. Our results suggest that spatial mobility analysis is a valuable tool for understanding the evolution of science.
Title: Charting mobility patterns in the scientific knowledge landscape (EPJ Data Science)
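The gravity-model description in the abstract above (more jumps toward dense areas, fewer over long distances) can be illustrated with the textbook form of the model. This is a minimal sketch only: the abstract does not give the authors' functional form or fitted parameters, so the function name `gravity_flow`, the constant `k`, and the exponent `gamma` are all assumptions made for illustration.

```python
def gravity_flow(mass_i, mass_j, distance, gamma=2.0, k=1.0):
    """Expected flow between two regions under a basic gravity model.

    mass_i, mass_j: density (e.g. number of articles) of the two regions.
    distance: separation between the regions in the embedding space.
    gamma, k: hypothetical decay exponent and scaling constant.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Flow grows with the product of the densities and decays with distance.
    return k * mass_i * mass_j / distance ** gamma


# Doubling the distance with gamma=2 cuts the expected flow to a quarter.
near = gravity_flow(10, 20, 2.0)
far = gravity_flow(10, 20, 4.0)
```

In practice `gamma` and `k` would be estimated from observed flows rather than fixed in advance.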
Pub Date: 2024-02-20 | DOI: 10.1140/epjds/s13688-023-00446-x
Luca Mungo, Silvia Bartolucci, Laura Alessandretti
Since the introduction of Bitcoin in 2009, the dramatic and unsteady evolution of the cryptocurrency market has also been driven by large investments from traditional and cryptocurrency-focused hedge funds. Notwithstanding their critical role, our understanding of the relationship between institutional investments and the evolution of the cryptocurrency market has remained limited, partly due to the lack of comprehensive data describing investments over time. In this study, we present a quantitative analysis of cryptocurrency institutional investments based on a dataset collected for 1324 currencies in the period between 2014 and 2022 from Crunchbase, one of the largest platforms gathering business information. We show that the evolution of the cryptocurrency market capitalization is highly correlated with the size of institutional investments, thus confirming their important role. Further, we find that the market is dominated by the presence of a group of prominent investors who tend to specialise by focusing on particular technologies. Finally, studying the co-investment network of currencies that share common investors, we show that assets with shared investors tend to be characterized by similar market behaviour. Our work sheds light on the role played by institutional investors and provides a basis for further research on their influence in the cryptocurrency ecosystem.
Title: Cryptocurrency co-investment network: token returns reflect investment patterns (EPJ Data Science)
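The co-investment network mentioned in the abstract above links currencies that share common investors. One plausible way to build such a network is a one-mode projection of the investor-currency relation, sketched below. The input layout, the function name `co_investment_edges`, and the choice of "number of shared investors" as edge weight are assumptions for illustration; the abstract does not say how the authors weight edges.

```python
from collections import Counter
from itertools import combinations


def co_investment_edges(portfolios):
    """Project investor -> currencies holdings onto currency pairs.

    portfolios: dict mapping an investor name to an iterable of currencies.
    Returns a Counter mapping (currency_a, currency_b), sorted alphabetically,
    to the number of investors holding both.
    """
    edges = Counter()
    for currencies in portfolios.values():
        # Deduplicate and sort so each pair has one canonical orientation.
        for a, b in combinations(sorted(set(currencies)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical toy data, not taken from the paper's dataset.
example = co_investment_edges({
    "fund_x": ["BTC", "ETH"],
    "fund_y": ["ETH", "BTC", "SOL"],
    "fund_z": ["SOL"],
})
```

Pairs with higher counts are the ones the abstract would expect to show more similar market behaviour.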
Pub Date: 2024-01-31 | DOI: 10.1140/epjds/s13688-024-00449-2
Manjin Shao, Hong Fan
The indirect correlation among financial institutions, stemming from similarities in their portfolios, is a primary driver of systemic risk. However, most existing research overlooks the influence of portfolio similarity among various types of financial institutions on this risk. Therefore, we construct the network of portfolio similarity correlations among different types of financial institutions, based on measurements of portfolio similarity. Utilizing the expanded fire sale contagion model, we offer a comprehensive assessment of systemic risk for Chinese financial institutions. Initially, we introduce indicators for systemic risk, systemic importance, and systemic vulnerability. Subsequently, we examine the cross-sectional and time-series characteristics of these institutions’ systemic importance and vulnerability within the context of the portfolio similarity correlation network. Our empirical findings reveal a high degree of portfolio similarity between banks and insurance companies, contrasted with lower similarity between banks and securities firms. Moreover, when considering the portfolio similarity correlation network, both the systemic importance and vulnerability of Chinese banks and insurance companies surpass those of securities firms in both cross-sectional and temporal dimensions. Notably, our analysis further illustrates that a financial institution’s systemic importance and vulnerability are strongly and positively associated with the magnitude of portfolio similarity between that institution and others.
Title: Identifying the systemic importance and systemic vulnerability of financial institutions based on portfolio similarity correlation network (EPJ Data Science)
Pub Date: 2024-01-31 | DOI: 10.1140/epjds/s13688-024-00450-9
Bao Tran Truong, Oliver Melbourne Allen, Filippo Menczer
The spread of misinformation poses a threat to the social media ecosystem. Effective countermeasures to mitigate this threat require that social media platforms be able to accurately detect low-credibility accounts even before the content they share can be classified as misinformation. Here we present methods to infer account credibility from information diffusion patterns, in particular leveraging two networks: the reshare network, capturing an account’s trust in other accounts, and the bipartite account-source network, capturing an account’s trust in media sources. We extend network centrality measures and graph embedding techniques, systematically comparing these algorithms on data from diverse contexts and social media platforms. We demonstrate that both kinds of trust networks provide useful signals for estimating account credibility. Some of the proposed methods yield high accuracy, providing promising solutions to promote the dissemination of reliable information in online communities. Two kinds of homophily emerge from our results: accounts tend to have similar credibility if they reshare each other’s content or share content from similar sources. Our methodology invites further investigation into the relationship between accounts and news sources to better characterize misinformation spreaders.
Title: Account credibility inference based on news-sharing networks (EPJ Data Science)
Pub Date: 2024-01-29 | DOI: 10.1140/epjds/s13688-024-00448-3
Michele Coscia
Professional sports are a cultural activity beloved by many, and a global hundred-billion-dollar industry. In this paper, we investigate the trends of match outcome predictability, assuming that the public is more interested in an event if there is some uncertainty about who will win. We reproduce previous methodology focused on soccer and we expand it by analyzing more than 300,000 matches in the 1996-2023 period from nine disciplines, to identify which disciplines are getting more/less predictable over time. We investigate the home advantage effect, since it can affect outcome predictability and it has been impacted by the COVID-19 pandemic. Going beyond previous work, we estimate which sport management model – between the egalitarian one popular in North America and the rich-get-richer used in Europe – leads to more uncertain outcomes. Our results show that there is no generalized trend in predictability across sport disciplines, that home advantage has been decreasing independently from the pandemic, and that sports managed with the egalitarian North American approach tend to be less predictable. We base our result on a predictive model that ranks teams by analyzing the directed network of who-beats-whom, where the most central teams in the network are expected to be the best performing ones. Our results are robust to the measure we use for the prediction.
Title: Which sport is becoming more predictable? A cross-discipline analysis of predictability in team sports (EPJ Data Science)
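The abstract above ranks teams by their centrality in the directed who-beats-whom network. The sketch below shows one way this could look, using a plain PageRank computed by power iteration. PageRank is a stand-in chosen for illustration; the abstract does not name the centrality measure the author uses, and the function name `pagerank_ranking` and its parameters are hypothetical.

```python
def pagerank_ranking(matches, damping=0.85, iterations=100):
    """Rank teams from (winner, loser) pairs with a plain PageRank.

    Each match adds a directed edge loser -> winner, so score flows
    toward teams that win. Returns team names, best first.
    """
    teams = sorted({team for match in matches for team in match})
    out_edges = {team: [] for team in teams}
    for winner, loser in matches:
        out_edges[loser].append(winner)

    n = len(teams)
    score = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        # Unbeaten teams have no outgoing edges; spread their score evenly.
        dangling = sum(score[t] for t in teams if not out_edges[t])
        new_score = {t: (1.0 - damping) / n + damping * dangling / n for t in teams}
        for loser, winners in out_edges.items():
            if winners:
                share = damping * score[loser] / len(winners)
                for winner in winners:
                    new_score[winner] += share
        score = new_score
    return sorted(teams, key=lambda t: score[t], reverse=True)


# Hypothetical results: A beats B and C, B beats C.
ranking = pagerank_ranking([("A", "B"), ("A", "C"), ("B", "C")])
```

A predictability measure could then compare each match outcome against the ranking computed from earlier matches, although how the paper does this is not specified in the abstract.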
Pub Date: 2024-01-19 | DOI: 10.1140/epjds/s13688-023-00442-1
Francesco Carli, Pietro Foini, Nicolò Gozzi, Nicola Perra, Rossano Schifanella
Most human activities require collaborations within and across formal or informal teams. Our understanding of how the collaborative efforts spent by teams relate to their performance is still a matter of debate. Teamwork results in a highly interconnected ecosystem of potentially overlapping components where tasks are performed in interaction with team members and across other teams. To tackle this problem, we propose a graph neural network model to predict a team’s performance while identifying the drivers determining such outcome. In particular, the model is based on three architectural channels: topological, centrality, and contextual, which capture different factors potentially shaping teams’ success. We endow the model with two attention mechanisms to boost model performance and allow interpretability. A first mechanism allows pinpointing key members inside the team. A second mechanism allows us to quantify the contributions of the three driver effects in determining the outcome performance. We test model performance on various domains, outperforming most classical and neural baselines. Moreover, we include synthetic datasets designed to validate how the model disentangles the intended properties, on which our model vastly outperforms baselines.
Title: Modeling teams performance using deep representational learning on graphs (EPJ Data Science)
Pub Date: 2024-01-16 | DOI: 10.1140/epjds/s13688-023-00447-w
Rajat Verma, Shagun Mittal, Zengxiang Lei, Xiaowei Chen, Satish V. Ukkusuri
Estimation of people’s home locations using location-based services data from smartphones is a common task in human mobility assessment. However, commonly used home detection algorithms (HDAs) are often arbitrary and unexamined. In this study, we review existing HDAs and examine five HDAs using eight high-quality mobile phone geolocation datasets. These include four commonly used HDAs as well as an HDA proposed in this work. To make quantitative comparisons, we propose three novel metrics to assess the quality of detected home locations and test them on eight datasets across four U.S. cities. We find that all three metrics show a consistent rank of HDAs’ performances, with the proposed HDA outperforming the others. We infer that the temporal and spatial continuity of the geolocation data points matters more than the overall size of the data for accurate home detection. We also find that HDAs with high (and similar) performance metrics tend to create results with better consistency and closer to common expectations. Further, the performance deteriorates with decreasing data quality of the devices, though the patterns of relative performance persist. Finally, we show how the differences in home detection can lead to substantial differences in subsequent inferences using two case studies—(i) hurricane evacuation estimation, and (ii) correlation of mobility patterns with socioeconomic status. Our work contributes to improving the transparency of large-scale human mobility assessment applications.
Title: Comparison of home detection algorithms using smartphone GPS data (EPJ Data Science)
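To make the idea of a home detection algorithm in the abstract above concrete, here is a minimal sketch of one widely used heuristic: take the grid cell where a device is seen most often during night hours. Whether this matches any of the five HDAs compared in the paper is not stated in the abstract; the function name `detect_home`, the night window, and the grid size are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime


def detect_home(pings, night_start=22, night_end=6, cell_size=0.001):
    """Return the most visited night-time grid cell, or None.

    pings: iterable of (timestamp, latitude, longitude) tuples, where
    timestamp is a datetime in the device's local time.
    The result is an integer (lat_index, lon_index) cell on a grid of
    cell_size degrees; all defaults are hypothetical.
    """
    visits = Counter()
    for timestamp, lat, lon in pings:
        # The night window wraps around midnight.
        if timestamp.hour >= night_start or timestamp.hour < night_end:
            cell = (round(lat / cell_size), round(lon / cell_size))
            visits[cell] += 1
    if not visits:
        return None
    return visits.most_common(1)[0][0]


# Hypothetical pings: two at night near one place, three by day elsewhere.
home = detect_home([
    (datetime(2024, 1, 1, 23, 0), 40.0001, -86.0001),
    (datetime(2024, 1, 2, 2, 0), 40.0002, -86.0002),
    (datetime(2024, 1, 2, 14, 0), 41.0, -87.0),
    (datetime(2024, 1, 2, 15, 0), 41.0, -87.0),
    (datetime(2024, 1, 2, 16, 0), 41.0, -87.0),
])
```

The daytime location has more pings overall but is ignored, which shows how the choice of time window alone can change the detected home and therefore any downstream inference.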
Pub Date: 2024-01-11 | DOI: 10.1140/epjds/s13688-023-00432-3
Abstract
This article provides a commentary on Thomas Grund’s International Conference on Computational Social Science 2021 keynote “Dynamics of Denunciation: The Limits of a Scandal”. The keynote presents results from research investigating the relational dynamics underpinning the denunciations provided in testimonies relating to a Canadian political scandal. Grund uses relational event models to test hypotheses about the social mechanisms driving the denunciations. Although denunciation should depend only on who is guilty and not on who has said what up to that point, Grund’s study finds evidence in support of a number of relational mechanisms influencing the denunciation process. Grund argues that the apparent influence of past denunciations on testimonies reveals the limits of the inquiry process itself and what it can reveal about a scandal. This article reviews Grund’s talk and puts the work in a broader context of using approaches rooted in event history modelling and social network theory to illuminate the processes defining social interaction data. It highlights ways in which the keynote can inform the development of computational social science approaches to analysing such data, and argues that the value of such an analysis has implications for scholarship beyond the social sciences.
Title: What relational event models can reveal: Commentary on Thomas Grund’s “Dynamics of Denunciation: The Limits of a Scandal” (EPJ Data Science)