The count of open source software packages hosted by the Comprehensive R Archive Network (CRAN) using key spatial data handling packages has now passed 1,000. Providing a comprehensive review of these packages is beyond the scope of an article. Consequently, this review takes the form of a comparative case study, reproducing some of the approach and workflow of a spatial analysis of a data set including almost all the census tracts in the coterminous United States. The case study moves from visualization and the construction of a spatial weights matrix, to exploratory spatial data analysis and spatial regression. For comparison, implementations of the same steps in PySAL and GeoDa are interwoven, and points of convergence and divergence are noted and discussed. Conclusions are drawn about the usefulness of open source software, the significance of sharing contributions both in software implementation and, more broadly, in reproducible research, and the opportunities for exchanging ideas and solutions with other research domains.
R Packages for Analyzing Spatial Data: A Comparative Case Study with Areal Data. Roger Bivand. Geographical Analysis 54(3), 488-518. Published 2022-02-06. DOI: 10.1111/gean.12319. Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/gean.12319
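The first two steps of the workflow described in the abstract above, building a spatial weights matrix and running an exploratory test for spatial autocorrelation, can be illustrated with a minimal NumPy sketch. It does not use the R packages, PySAL, or GeoDa discussed in the article; the regular grid, the function names (`rook_weights`, `row_standardize`, `morans_i`) and the toy variables are assumptions made only for illustration.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary rook-contiguity matrix for a regular nrows x ncols grid (toy layout)."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:              # neighbour below
                j = (r + 1) * ncols + c
                W[i, j] = W[j, i] = 1.0
            if c + 1 < ncols:              # neighbour to the right
                j = r * ncols + (c + 1)
                W[i, j] = W[j, i] = 1.0
    return W

def row_standardize(W):
    """Scale each row to sum to one; rows without neighbours stay zero."""
    sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, sums, out=np.zeros_like(W), where=sums > 0)

def morans_i(y, W):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z = y - mean(y)."""
    z = y - y.mean()
    return (len(y) / W.sum()) * (z @ W @ z) / (z @ z)

W = row_standardize(rook_weights(4, 4))
trend = np.repeat(np.arange(4.0), 4)                      # values rise row by row
checker = np.indices((4, 4)).sum(axis=0).ravel() % 2.0    # alternating 0/1 pattern
print(morans_i(trend, W), morans_i(checker, W))
```

The smooth trend yields a positive statistic, while the checkerboard, in which every rook neighbour carries the opposite value, yields exactly -1 under row-standardized weights. Real census-tract data would need contiguity derived from polygon geometry, which this sketch does not attempt.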
This article focuses on the estimation of the spatial Durbin model and associated relative impacts implemented in the R library sphet. Specifically, the current version of the library offers two ways of performing inference: one based on drawing samples from a multivariate normal distribution, and the other based on an analytical formula. The performance of these two methods is compared using an extensive Monte Carlo experiment. As an illustration of the kind of analysis that can be performed with sphet, the article also presents an empirical application looking at economic growth of Italian provinces.
A deeper look at impacts in spatial Durbin model with sphet. Gianfranco Piras, Paolo Postiglione. Geographical Analysis 54(3), 664-684. Published 2022-01-10. DOI: 10.1111/gean.12318.
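To make the notion of impacts concrete, the sketch below computes average direct, indirect, and total impacts for one covariate in a spatial Durbin model, y = rho*W*y + X*beta + W*X*theta + e, from the matrix S = (I - rho*W)^(-1) (beta*I + theta*W). It then approximates their standard deviations by drawing parameters from a multivariate normal distribution, which is the general idea behind the first inference route named in the abstract. This is not the sphet implementation: the ring-shaped weights, the parameter values, and the covariance matrix are hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Row-standardized weights for n units on a ring, two neighbours each (toy layout)."""
    eye = np.eye(n)
    return 0.5 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))

def sdm_impacts(rho, beta, theta, W):
    """Average direct, indirect and total impacts of one covariate."""
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) @ (beta * np.eye(n) + theta * W)
    direct = np.trace(S) / n          # mean own-unit effect
    total = S.sum() / n               # mean row sum
    return direct, total - direct, total

def simulated_impact_sd(est, cov, W, draws=500, seed=0):
    """Standard deviations of the impacts from multivariate normal parameter draws."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(est, cov, size=draws)
    samples = samples[np.abs(samples[:, 0]) < 1]   # keep draws with |rho| < 1
    impacts = np.array([sdm_impacts(r, b, t, W) for r, b, t in samples])
    return impacts.std(axis=0, ddof=1)

W = ring_weights(20)
est = np.array([0.4, 1.0, 0.5])            # hypothetical (rho, beta, theta)
cov = np.diag([0.01, 0.02, 0.02])          # hypothetical covariance matrix
direct, indirect, total = sdm_impacts(*est, W)
sd = simulated_impact_sd(est, cov, W)
print(direct, indirect, total, sd)
```

With row-standardized weights the total impact reduces to (beta + theta) / (1 - rho), which gives a quick check on the matrix computation.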
Alexis Comber, Christopher Brunsdon, Martin Charlton, Guanpeng Dong, Richard Harris, Binbin Lu, Yihe Lü, Daisuke Murakami, Tomoki Nakaya, Yunqiang Wang, Paul Harris
Geographically Weighted Regression (GWR) is increasingly used in spatial analyses of social and environmental data. It allows spatial heterogeneities in processes and relationships to be investigated through a series of local regression models rather than a single global one. Standard GWR assumes that relationships between the response and predictor variables operate at the same spatial scale, which is frequently not the case. To address this, several GWR variants have been proposed. This paper describes a route map to decide whether to use a GWR model or not, and if so which of three core variants to apply: a standard GWR, a mixed GWR or a multiscale GWR (MS-GWR). The route map comprises three primary steps that should always be undertaken: (1) a basic linear regression, (2) an MS-GWR, and (3) investigations of the results of these in order to decide whether to use a GWR approach and, if so, to determine the appropriate GWR variant. The paper also highlights the importance of investigating a number of secondary issues at global and local scales including collinearity, the influence of outliers, and dependent error terms. Code and data for the case study used to illustrate the route map are provided.
A Route Map for Successful Applications of Geographically Weighted Regression. Alexis Comber, Christopher Brunsdon, Martin Charlton, Guanpeng Dong, Richard Harris, Binbin Lu, Yihe Lü, Daisuke Murakami, Tomoki Nakaya, Yunqiang Wang, Paul Harris. Geographical Analysis 55(1), 155-178. Published 2022-01-09. DOI: 10.1111/gean.12316. Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/gean.12316
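The contrast between a single global regression and a series of local ones can be sketched in a few lines. The example below fits ordinary least squares and then a standard GWR with a fixed Gaussian kernel, solving a weighted least squares problem at each location. It is an illustration only, not the code released with the paper: the bandwidth is fixed by hand rather than selected, the mixed and multiscale variants are not covered, and the simulated data and the function name `gwr_coefficients` are assumptions.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local coefficients from weighted least squares with a Gaussian kernel."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])          # add an intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)    # weights decay with distance
        XtW = Xd.T * w                             # X' W_i without forming W_i
        betas[i] = np.linalg.solve(XtW @ Xd, XtW @ y)
    return betas

# Simulated data in which the slope increases from west to east.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 1, size=(200, 2))
x = rng.normal(size=200)
slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + slope * x + rng.normal(scale=0.1, size=200)

# Step 1 of the route map: a basic (global) linear regression.
ols = np.linalg.lstsq(np.column_stack([np.ones(200), x]), y, rcond=None)[0]
# Local fits for comparison (standard GWR, hand-picked bandwidth).
local = gwr_coefficients(coords, x, y, bandwidth=0.15)
print(ols[1], local[:, 1].min(), local[:, 1].max())
```

The global slope summarizes the relationship with one number, whereas the local slopes vary with the west-east coordinate, which is the kind of heterogeneity the route map asks analysts to look for before committing to a GWR variant.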
Chiara Ghiringhelli, Gianfranco Piras, Giuseppe Arbia, Antonietta Mira
In this paper, we propose a recursive approach to estimate the spatial error model. We compare the suggested methodology with standard estimation procedures and report a set of Monte Carlo experiments which show that the recursive approach substantially reduces the computational effort while affecting the precision of the estimators only within reasonable limits. The proposed technique can prove helpful when applied to real-time streams of geographical data that are becoming increasingly available in the big data era. Finally, we illustrate this methodology using a set of earthquake data.
Recursive Estimation of the Spatial Error Model. Chiara Ghiringhelli, Gianfranco Piras, Giuseppe Arbia, Antonietta Mira. Geographical Analysis 55(1), 90-106. Published 2022-01-03. DOI: 10.1111/gean.12317.
The growing interest in causal inference in recent years has led to new causal inference methodologies and their applications across disciplines and research domains. Yet, studies on spatial causal inference are still rare. Causal inference on spatial processes is faced with additional challenges, such as spatial dependency, spatial heterogeneity, and spatial effects. These challenges can lead to spurious results and, subsequently, incorrect interpretations of the outcomes of causal analyses. Recognizing the growing importance of causal inference in the spatial domain, we conduct a systematic literature review on spatial causal inference based on a formal concept mapping. To identify how to assess and control for the adverse effects of spatial influences, we assess publications relevant to spatial causal inference based on criteria relating to application discipline, methods used, and techniques applied for managing issues related to spatial processes. We thus present a snapshot of the state of the art in spatial causal inference and identify methodological gaps, weaknesses and challenges of current spatial inference studies, along with opportunities for future research.
Spatial Causality: A Systematic Review on Spatial Causal Inference. Kamal Akbari, Stephan Winter, Martin Tomko. Geographical Analysis 55(1), 56-89. Published 2021-12-14. DOI: 10.1111/gean.12312.
Since its introduction more than 15 years ago, the GeoDa software for the exploration of spatial data has transitioned from a closed source Windows-only solution to an open source and cross-platform product that takes on the look and feel of the native operating system. This article reports on the evolution in the functionality and architecture of the software and pays particular attention to its new implementation as a library, libgeoda. This library, through a clearly structured API, can be integrated into other software environments, such as R (rgeoda) and Python (pygeoda). This integration is illustrated with two small empirical examples, investigating local clusters in a historical London cholera data set and among socioeconomic determinants of health in Chicago. A timing experiment demonstrates the competitive performance of GeoDa desktop, libgeoda (C++), rgeoda and pygeoda compared to established solutions in R spdep and Python PySAL, evaluating conditional permutation inference for the Local Moran statistic.
GeoDa, From the Desktop to an Ecosystem for Exploring Spatial Data. Luc Anselin, Xun Li, Julia Koschinsky. Geographical Analysis 54(3), 439-466. Published 2021-11-19. DOI: 10.1111/gean.12311.
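The statistic used in the timing experiment, the Local Moran with conditional permutation inference, can be sketched as follows. For each location the observed value is held fixed, the remaining values are randomly reassigned to its neighbours, and the local statistic is recomputed to build a reference distribution. This is a plain NumPy illustration, not the libgeoda, rgeoda, pygeoda, spdep, or PySAL code; the ring-shaped weights, the simulated data, the folded pseudo p-value convention, and the function name are assumptions.

```python
import numpy as np

def local_moran_permutation(y, W, permutations=199, seed=0):
    """Local Moran statistics and conditional-permutation pseudo p-values."""
    rng = np.random.default_rng(seed)
    n = len(y)
    z = y - y.mean()
    m2 = (z @ z) / n
    observed = z * (W @ z) / m2                  # I_i = z_i * sum_j w_ij z_j / m2
    p = np.empty(n)
    for i in range(n):
        nbrs = np.flatnonzero(W[i])
        others = np.delete(z, i)                 # z_i itself stays in place
        sims = np.empty(permutations)
        for s in range(permutations):
            draw = rng.choice(others, size=len(nbrs), replace=False)
            sims[s] = z[i] * (W[i, nbrs] @ draw) / m2
        larger = np.sum(sims >= observed[i])
        larger = min(larger, permutations - larger)   # fold to the smaller tail
        p[i] = (larger + 1) / (permutations + 1)
    return observed, p

# Toy data: 30 units on a ring, the first 10 forming a high-value cluster.
n = 30
eye = np.eye(n)
W = 0.5 * (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1))
rng = np.random.default_rng(42)
y = np.concatenate([rng.normal(3.0, 1.0, 10), rng.normal(0.0, 1.0, 20)])
I_local, p_sim = local_moran_permutation(y, W)
print(I_local.round(2), p_sim.round(3))
```

The double loop makes the cost proportional to the number of locations times the number of permutations, which is why implementation details dominate timing comparisons of the kind the article reports. With row-standardized weights, the mean of the local statistics equals the global Moran's I.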
The emergence of the novel SARS-CoV-2 coronavirus and the global COVID-19 pandemic in 2019 led to explosive growth in scientific research. Alas, much of the research in the literature lacks the conditions needed to be reproducible, and recent publications on the association between population density and the basic reproductive number of SARS-CoV-2 are no exception. Relatively few papers share code and data sufficiently, which hinders not only verification but also additional experimentation. In this article, an example of reproducible research shows the potential of spatial analysis for epidemiology research during COVID-19. Transparency and openness mean that independent researchers can, with only modest effort, verify findings and use different approaches as appropriate. Given the high stakes of the situation, it is essential that scientific findings, on which good policy depends, are as robust as possible; as the empirical example shows, reproducibility is one of the keys to ensuring this.
Reproducibility of Research During COVID-19: Examining the Case of Population Density and the Basic Reproductive Rate from the Perspective of Spatial Analysis. Antonio Paez. Geographical Analysis 54(4), 860-880. Published 2021-11-18. DOI: 10.1111/gean.12307. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652856/pdf/GEAN-9999-0.pdf
Karla Therese L. Sy, Laura F. White, Brooke E. Nichols
Reproducible research becomes even more imperative as we build the evidence base on SARS-CoV-2 epidemiology, diagnosis, prevention, and treatment. In his study, Paez assessed the reproducibility of COVID-19 research during the pandemic, using a case study of population density. He found that most articles that assess the relationship of population density and COVID-19 outcomes do not publicly share data and code, except for a few, including our paper, which he stated “illustrates the importance of good reproducibility practices”. Paez recreated our analysis using our code and data from the perspective of spatial analysis, and his new model came to a different conclusion. The disparity between our and Paez’s findings, as well as other existing literature on the topic, gives greater impetus to the need for further research. As there has been near exponential growth of COVID-19 research across a wide range of scientific disciplines, reproducible science is a vital component in producing reliable, rigorous, and robust evidence on COVID-19, which will be essential to inform clinical practice and policy in order to effectively eliminate the pandemic.
Reproducible Science Is Vital for a Stronger Evidence Base During the COVID-19 Pandemic. Karla Therese L. Sy, Laura F. White, Brooke E. Nichols. Geographical Analysis 55(1), 203-206. Published 2021-11-16. DOI: 10.1111/gean.12314. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8652901/pdf/GEAN-9999-0.pdf
This study discusses the importance of balancing spatial and non-spatial variation in spatial regression modeling. Unlike spatially varying coefficients (SVC) modeling, which is popular in spatial statistics, non-spatially varying coefficients (NVC) modeling has largely been unexplored in spatial fields. Nevertheless, as we will explain, consideration of non-spatial variation is needed not only to improve model accuracy but also to reduce spurious correlation among varying coefficients, which is a major problem in SVC modeling. We consider a Moran eigenvector approach modeling spatially and non-spatially varying coefficients (S&NVC). A Monte Carlo simulation experiment comparing our S&NVC model with existing SVC models suggests that our approach is both accurate and computationally efficient. Beyond that, somewhat surprisingly, our approach identifies true and spurious correlations among coefficients nearly perfectly, even when usual SVC models suffer from severe spurious correlations. This implies that the S&NVC model should be used even when the purpose of the analysis is modeling SVCs. Finally, our S&NVC model is employed to analyze a residential land price data set. Its results suggest the existence of both spatial and non-spatial variation in regression coefficients in practice. The S&NVC model is now implemented in the R package spmoran.
Balancing Spatial and Non-Spatial Variation in Varying Coefficient Modeling: A Remedy for Spurious Correlation. Daisuke Murakami, Daniel A. Griffith. Geographical Analysis 55(1), 31-55. Published 2021-11-14. DOI: 10.1111/gean.12310. Open access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/gean.12310
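The Moran eigenvector idea underlying the approach can be illustrated with a short sketch. Moran eigenvectors are the eigenvectors of M C M, where C is a symmetric connectivity matrix and M = I - 11'/n is the centering projector; eigenvectors with positive eigenvalues describe map patterns with positive spatial autocorrelation, and a spatially varying coefficient can be written as a constant plus a linear combination of them. This is not the spmoran implementation and it does not model the non-spatial variation the article adds; the grid layout, the threshold, and the weights `gamma` are hypothetical.

```python
import numpy as np

def moran_eigenvectors(C, threshold=1e-8):
    """Eigenpairs of M C M with eigenvalue above threshold, largest first."""
    n = C.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n            # centering projector
    vals, vecs = np.linalg.eigh(M @ C @ M)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > threshold
    return vals[keep], vecs[:, keep]

# Rook connectivity on a 5 x 5 grid, built from a path graph with Kronecker products.
k = 5
path = np.diag(np.ones(k - 1), 1) + np.diag(np.ones(k - 1), -1)
C = np.kron(np.eye(k), path) + np.kron(path, np.eye(k))

vals, E = moran_eigenvectors(C)

# A spatially varying coefficient as a constant plus a smooth map pattern.
gamma = np.array([2.0, -1.0, 0.5])                 # hypothetical weights
beta_svc = 1.0 + E[:, :3] @ gamma
print(vals[:3].round(3), beta_svc.reshape(k, k).round(2))
```

Each retained eigenvector has a Moran's I equal to its eigenvalue scaled by n divided by the sum of the entries of C, so ordering by eigenvalue orders the map patterns from the most to the least positively autocorrelated.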