Pub Date: 2024-09-10 DOI: 10.1007/s10109-024-00447-y
Yukio Sadahiro, Ikuho Yamada
This paper proposes a new method of point cluster analysis. There are at least three important points that we need to consider in the evaluation of point clusters. The first is spatial inhomogeneity, i.e., the inhomogeneity of locations where points can be located. The second is aspatial inhomogeneity, which indicates the inhomogeneity of point characteristics. The third is an explicit representation of the geographic scale of analysis. This paper proposes a method that considers these points in a statistical framework. We develop two measures of point clusters: local and global. The former permits us to discuss the spatial variation in point clusters, while the latter indicates the global tendency of point clusters. To test the method’s validity, this paper applies it to the analysis of hypothetical and real datasets. The results supported the soundness of the proposed method.
Title: Point cluster analysis using weighted random labeling (Journal of Geographical Systems)
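The abstract does not spell out the statistic, so the following is only a minimal sketch of what a local test under a weighted random-labeling null could look like. It assumes that each point carries its own labeling probability (standing in for aspatial inhomogeneity), that the fixed point locations carry the spatial inhomogeneity, and that a search radius expresses the scale of analysis. The function names, the independent-labeling null model, and the Monte Carlo p-value are illustrative assumptions, not the authors' method.

```python
import math
import random


def count_labeled_within(points, labels, center, radius):
    """Count labeled points whose distance to `center` is at most `radius`."""
    cx, cy = center
    return sum(
        1
        for (x, y), lab in zip(points, labels)
        if lab and math.hypot(x - cx, y - cy) <= radius
    )


def local_cluster_pvalue(points, labels, probs, center, radius, n_sim=999, seed=0):
    """Monte Carlo p-value for an excess of labeled points near `center`.

    Null model (an assumption of this sketch): point i is labeled
    independently with probability probs[i]; locations stay fixed.
    """
    rng = random.Random(seed)
    observed = count_labeled_within(points, labels, center, radius)
    at_least = 0
    for _ in range(n_sim):
        simulated = [rng.random() < p for p in probs]
        if count_labeled_within(points, simulated, center, radius) >= observed:
            at_least += 1
    # The +1 terms count the observed pattern as one realization of the null.
    return (at_least + 1) / (n_sim + 1)
```

Repeating the call over a grid of centers would give a local map, and varying `radius` makes the scale of analysis explicit. A global measure would need a separate summary statistic, which this sketch does not attempt.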
Pub Date: 2024-09-04 DOI: 10.1007/s10109-024-00448-x
Sophiya Gyanwali, Shashank Karki, Kee Moon Jang, Tom Crawford, Mengxi Zhang, Junghwan Kim
Recent studies on green space exposure have argued that overlooking human mobility could lead to erroneous estimates of exposure and of the associated inequality. However, these studies are limited because they focused on single cities rather than multiple cities, which could exhibit variations in people’s mobility patterns and in the spatial distribution of green spaces. Moreover, previous studies focused mainly on large cities while overlooking other areas, such as small cities and rural neighborhoods. In other words, potential spatial non-stationarity issues in estimating green space exposure inequality remain unclear. To fill these significant research gaps, we utilized commute data of 31,862 people from Virginia, West Virginia, and Kentucky. A deep learning technique was used to extract green spaces from street-view images to estimate people’s home-based and mobility-based green exposure levels. The results showed that the overall inequality in exposure levels was lower when people’s mobility was considered than when only home-based exposure levels were used, implying the neighborhood effect averaging problem (NEAP). Correlation coefficients between individual exposure levels and their social vulnerability indices showed mixed and complex patterns across neighborhood type and size, indicating the presence of spatial non-stationarity. Our results underscore the crucial role of mobility in exposure assessments and the spatial non-stationarity issue when evaluating exposure inequalities. The results imply that locally specific studies are urgently needed to develop local policies that precisely alleviate inequality in exposure.
Title: Implications for spatial non-stationarity and the neighborhood effect averaging problem (NEAP) in green inequality research: evidence from three states in the USA (Journal of Geographical Systems)
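The averaging effect described in the abstract can be illustrated with a small sketch. The abstract does not name the inequality measure, so the Gini coefficient is used here as an assumption, and the mobility-based exposure is taken to be the plain average of home and workplace exposure, which is also an assumption. All exposure values are made up for illustration.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum form of the Gini coefficient for sorted data.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical green exposure levels (share of green pixels) for four people.
home_exposure = [0.1, 0.2, 0.6, 0.9]
work_exposure = [0.5, 0.5, 0.4, 0.3]

# Assumed mobility-based exposure: simple mean of home and workplace values.
mobility_exposure = [(h + w) / 2 for h, w in zip(home_exposure, work_exposure)]

home_gini = gini(home_exposure)
mobility_gini = gini(mobility_exposure)
```

In this toy example people with low home exposure commute to greener places and vice versa, so the mobility-based values sit closer to the mean and `mobility_gini` comes out lower than `home_gini`, which is the pattern the NEAP predicts.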
Pub Date: 2024-09-03 DOI: 10.1007/s10109-024-00445-0
Fernando H. Taques, Coro Chasco, Flávio H. Taques
Accessing massive datasets can be challenging for users unfamiliar with programming code. Combining Konstanz Information Miner (KNIME) and MySQL tools on standard configuration equipment addresses this issue. This research proposal aims to present a methodology that describes the necessary configuration steps in both tools and the manipulation required in KNIME to transmit the information to the MySQL environment for further processing in a database management system (DBMS). In addition, we propose a procedure so that research work using this point-and-click software can gain in reproducibility and, therefore, in credibility within the scientific community. To achieve this, we will use a large patent application database as a reference: PATSTAT Global 2023, provided by the European Patent Office (EPO). As is well known, patent data can be a valuable source for understanding innovation dynamics and technological trends, whether for studies on companies, sectors, nations or even regions, at aggregated and disaggregated levels.
Title: Integrating big data with KNIME as an alternative without programming code: an application to the PATSTAT patent database (Journal of Geographical Systems)
Pub Date: 2024-07-17 DOI: 10.1007/s10109-024-00444-1
Milad Malekzadeh, Jed A. Long
Many studies examine the relationship between socioeconomic factors and human mobility indicators. However, it is well documented that mobility levels are also driven by the geographical context where individual movement takes place. Here we test whether accounting for geographical context leads to new or different interpretations of human mobility behavior when studying associations with socioeconomic factors. Specifically, we define the mobility deviation index as the relative level of observed mobility compared to the expected mobility for a specific location, where expected mobility accounts for geographical context. Our results highlight the significant role of context when interpreting spatial patterns of human mobility. We demonstrate that controlling for the effects of geographical context substantially changes our interpretation of associations between measures of human mobility and socioeconomic variables. These results represent an important step in furthering our understanding of the role of place in human mobility patterns.
Title: Mobility deviation index: incorporating geographical context into analysis of human mobility (Journal of Geographical Systems)
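The definition in the abstract (observed mobility relative to expected mobility given context) can be sketched as a ratio. The abstract does not say how expected mobility is modelled, so this sketch assumes a simple least-squares line on a single context variable. The function names and the choice of a ratio rather than a difference are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mobility_deviation(observed, context):
    """Ratio of observed mobility to the mobility expected from context.

    Values above 1 mean a location is more mobile than its geographical
    context alone would suggest; values below 1 mean less mobile.
    """
    intercept, slope = fit_line(context, observed)
    expected = [intercept + slope * c for c in context]
    return [o / e for o, e in zip(observed, expected)]
```

The resulting index, rather than raw mobility, would then be the variable compared against socioeconomic factors, so that differences explained by context alone do not drive the association.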
Pub Date: 2024-07-02 DOI: 10.1007/s10109-024-00442-3
Ghislain Geniaux
Spatially varying coefficient models, such as GWR (Brunsdon et al. in Geogr Anal 28:281–298, 1996 and McMillen in J Urban Econ 40:100–124, 1996), find extensive applications across various fields, including housing markets, land use, population ecology, seismology, and mining research. These models are valuable for capturing the spatial heterogeneity of coefficient values. In many application areas, the continuous expansion of spatial data sample sizes, in terms of both volume and richness of explanatory variables, has given rise to new methodological challenges. The primary issues revolve around the time required to calculate each local coefficient and the memory required to store the large hat matrix (of size $n \times n$) for parameter variance estimation. Researchers have explored various approaches to address these challenges (Harris et al. in Trans GIS 14:43–61, 2010; Pozdnoukhov and Kaiser in: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2011; Tran et al. in: 2016 Eighth International Conference on Knowledge and Systems Engineering (KSE), IEEE, 2016; Geniaux and Martinetti in Reg Sci Urban Econ 72:74–85, 2018; Li et al. in Int J Geogr Inf Sci 33:155–175, 2019; Murakami et al. in Ann Am Assoc Geogr 111:459–480, 2020). While the use of a subset of target points for local regressions has been extensively studied in nonparametric econometrics, its application within the context of GWR has been relatively unexplored. In this paper, we propose an original two-stage method designed to accelerate GWR computations. We select a subset of target points based on the spatial smoothing of residuals from a first-stage regression, conducting GWR solely on this subsample. Additionally, we propose an original approach for extrapolating coefficients to non-target points. In addition to using an effective sample of target points, we explore the computational gain provided by using a truncated Gaussian kernel to create sparser matrices during computation. Our Monte Carlo experiments demonstrate that this method of target point selection outperforms methods based on point density or random selection. The results also reveal that using target points can reduce bias and root mean square error (RMSE) in estimating $\beta$ coefficients compared to traditional GWR, as it enables the selection of a more accurate bandwidth size. We demonstrate that our estimator is scalable and exhibits superior properties in this regard compared to the estimator of Murakami et al. (Ann Am Assoc Geogr 111:459–480, 2020) under two conditions: the use of a ratio of target points that provides a satisfactory approximation of coefficients (10–20% of locations) and an optimal bandwidth that remains within a reasonable neighborhood (<5000 neighbors). All the estimators of GWR with target points are now accessible in the R package mgwrsar for GWR and Mixed GWR with and without spatial autocorrelation, available in the CRAN repository (https://CRAN.R-project.org/package=mgwrsar).
Title: Speeding up estimation of spatially varying coefficients models (Journal of Geographical Systems)
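To make the two computational ideas concrete (fitting local regressions only at target points, and truncating the Gaussian kernel so that distant observations get zero weight), here is a minimal pure-Python sketch with a single regressor. It is not the mgwrsar implementation. The target points are simply supplied by the caller, whereas the paper selects them from smoothed first-stage residuals, and the extrapolation of coefficients to non-target points is not covered. All function names are hypothetical.

```python
import math


def truncated_gaussian(distance, bandwidth, cutoff):
    """Gaussian kernel weight, set to exactly zero beyond `cutoff`."""
    if distance > cutoff:
        return 0.0
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def local_fit(target, coords, x, y, bandwidth, cutoff):
    """Weighted least squares of y on (1, x), centred on `target`.

    Returns (intercept, slope). Assumes at least two observations with
    different x values fall inside the cutoff; otherwise the 2x2 system
    is singular and the division below fails.
    """
    w = [truncated_gaussian(math.dist(target, c), bandwidth, cutoff) for c in coords]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    return intercept, slope


def gwr_at_targets(targets, coords, x, y, bandwidth, cutoff):
    """Local coefficients at the target points only, not at all n locations."""
    return [local_fit(t, coords, x, y, bandwidth, cutoff) for t in targets]
```

With m target points instead of n locations, the number of local regressions drops from n to m, and the zero weights beyond the cutoff are what would make the weight matrices sparse in a matrix-based implementation.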
Pub Date: 2024-06-19 DOI: 10.1007/s10109-024-00443-2
Alan T. Murray, Luc Anselin, Sergio J. Rey
The passing of Professor Arthur Getis in May of 2022 initiated a number of events to both reflect on and remember the tremendous contributions he has made over his career to geographical systems more broadly, but also spatial analysis, spatial statistics, regional science, geography and GIScience, among others. This began with a series of sessions at the North American Regional Science Council meetings held in Montreal, Canada in November of 2022, followed by an invitation from the editors of Journal of Geographical Systems, Manfred Fischer, Antonio Paez, and Petra Staufer-Steinnocher, to organize a special issue in his honor. Soon thereafter, an open call was initiated to solicit submissions to this special issue, seeking original contributions that overlap and complement the broad range of research undertaken by Professor Getis over a distinguished career and life. This paper offers an overview of prominent geographical systems work carried out by Professor Getis. Reflections on his many contributions are also detailed. A summary of the contributions to this special issue is given, along with final thoughts.
Title: Arthur Getis: a legend in geographical systems (Journal of Geographical Systems)
Pub Date: 2024-05-18 DOI: 10.1007/s10109-024-00440-5
J. Paul Elhorst, Ioanna Tziolas, Chang Tan, Petros Milionis
This paper quantifies and graphically illustrates the distance decay effect and spatial reach of spillover effects derived from a spatial Durbin (SD) model with parameterized spatial weight matrices. Building on attributes of the concept of spatial autocorrelation developed by Arthur Getis, we adopt a distance-based negative exponential spatial weight matrix and parameterize it by a decay parameter that is different for each spatial lag in this model, both of the regressand and of all regressors. The quantification and illustration are applied to the spatially augmented neoclassical growth framework, which we estimate using data for 266 NUTS-2 regions in the EU over the period 2000–2018. We find distance decay parameters ranging from 0.233 to 2.224 and spatial reaches ranging from 700 to more than 1500 km for the different growth determinants in this model. These wide ranges highlight the restrictiveness of the conventional SD model based on one common spatial weight matrix for all spatial lags.
Title: The distance decay effect and spatial reach of spillovers (Journal of Geographical Systems)
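The model structure described in the abstract can be written out as follows. This is a sketch in assumed notation: the symbols $\rho$, $\theta_k$, $\delta_k$ and $d_{ij}$ are not taken from the paper, and the abstract does not say whether the weight matrices are normalized, so no normalization is shown.

```latex
% Sketch of a spatial Durbin model with one decay parameter per spatial lag.
% Notation is assumed for illustration, not copied from the paper.
\begin{align}
  y &= \rho\, W(\delta_0)\, y + X\beta
       + \sum_{k=1}^{K} \theta_k\, W(\delta_k)\, x_k + \varepsilon, \\
  w_{ij}(\delta) &=
    \begin{cases}
      \exp(-\delta\, d_{ij}) & \text{if } i \neq j, \\
      0 & \text{if } i = j,
    \end{cases}
\end{align}
% d_{ij}: distance between regions i and j; x_k: k-th column of X.
```

Setting $\delta_0 = \delta_1 = \dots = \delta_K$ recovers the conventional SD model with one common spatial weight matrix. A larger $\delta_k$ makes the weights fall off faster with distance, which corresponds to a shorter reach for the spillovers of regressor $k$.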
Spatial flows represent spatial interactions or movements. Mining colocation patterns of different types of flows may uncover the spatial dependences and associations among flows. Previous studies proposed a flow colocation pattern mining method and established a significance test under the null hypothesis of independence for the results. In fact, the definition of the null hypothesis is crucial in significance testing. Choosing an inappropriate null hypothesis may lead to misunderstandings about the spatial interactions between flows. In practice, the overall distribution patterns of different types of flows may be clustered. In these cases, the null hypothesis of independence will result in unconvincing results. Thus, considering the overall spatial pattern of flows, in this study, we changed the null hypothesis to random labeling to establish the statistical significance of flow colocation patterns. Furthermore, we compared and analyzed the impacts of different null hypotheses on flow colocation pattern mining through synthetic data tests with different preset patterns and situations. Additionally, we used empirical data from ride-hailing trips to show the practicality of the method.
Title: Rethinking the null hypothesis in significant colocation pattern mining of spatial flows (Journal of Geographical Systems)
Authors: Mengjie Zhou, Mengjie Yang, Tinghua Ai, Jiannan Cai, Zhe Chen
Pub Date: 2024-05-03 DOI: 10.1007/s10109-024-00439-y
Pub Date: 2024-02-22 DOI: 10.1007/s10109-024-00438-z
Manfred M. Fischer, Antonio Paez, Petra Staufer-Steinnocher
Title: 2023 JGS best paper award and the editors’ choice paper volume 26(1) (Journal of Geographical Systems)
Pub Date : 2024-01-25 DOI: 10.1007/s10109-023-00435-8
Batuhan Kilic, Onur Can Bayrak, Fatih Gülgen, Mert Gurturk, Perihan Abay
In today's era, the address plays a crucial role as one of the key components that enable mobility in daily life. Address data are used by global map platforms and location-based services to pinpoint a geographically referenced location. Geocoding provided by online platforms is useful in the spatial tracking of reported cases and controls in the spatial analysis of infectious illnesses such as COVID-19. The first and most critical phase in the geocoding process is address matching. However, due to typographical errors, variations in abbreviations used, and incomplete or malformed addresses, the matching can seldom be performed with 100% accuracy. The purpose of this research is to examine the capabilities of machine learning classifiers that can be used to measure the consistency of address matching results produced by online geocoding services and to identify the best performing classifier. The performance of the seven machine learning classifiers was compared using several text similarity measures, which assess the match scores between the input address data and the services' output. The data utilized in the testing came from four distinct online geocoding services applied to 925 addresses in Türkiye. The findings from this study revealed that the Random Forest machine learning classifier was the most accurate in the address matching procedure. While the results of this study hold true for similar datasets in Türkiye, additional research is required to determine whether they apply to data in other countries.
Title: Unveiling the impact of machine learning algorithms on the quality of online geocoding services: a case study using COVID-19 data. Journal of Geographical Systems.