Journal of Geographical Systems: Latest Articles

County-to-county migration modeling in the United States: the effects of data source and model selection

IF 2.9, Zone 3 (Earth Sciences), Q1 GEOGRAPHY

Journal of Geographical Systems

Pub Date : 2025-07-01 Epub Date: 2025-07-11 DOI: 10.1007/s10109-025-00470-7

Philip E Morefield, Timothy F Leslie

Internal migration plays a critical role in shaping demographic and economic landscapes, yet the ability to model migration flows accurately remains a methodological challenge. This study evaluates the performance of different migration models applied to three key U.S. data sources: the Internal Revenue Service (IRS) migration data, the American Community Survey (ACS), and the Census long-form data. While these datasets provide valuable insights into county-to-county migration, they differ in temporal coverage, flow suppression thresholds, and demographic granularity, each introducing unique challenges to migration modeling. Using a comparative framework, this study assesses the impact of data source selection on the accuracy and bias of widely used migration models, including the gravity model, Poisson regression, and the radiation model. Our findings highlight the trade-offs inherent in each dataset, demonstrating that IRS data yield lower prediction errors in aggregate flow estimates but lack demographic specificity, whereas ACS and Census data offer richer demographic detail and capture a larger number of distinct migration streams, though they may introduce noise due to small-flow estimates and suppression thresholds for confidentiality. The results underscore the importance of aligning data selection with research objectives and contribute to broader discussions on best practices for migration modeling.
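As a rough illustration of two of the model families named in the abstract above, the sketch below evaluates a gravity model and the radiation model for a toy three-county system. It is not the authors' implementation: the function names, the default parameters (k, alpha, beta, gamma), and the populations, distances, and outflow total are all assumptions chosen for illustration. The radiation formula is the standard parameter-free form, in which s is the population living no farther from the origin than the destination is (excluding the origin and destination themselves).

```python
def gravity_flow(pop_i, pop_j, distance, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity model: T_ij = k * P_i^alpha * P_j^beta / d_ij^gamma."""
    return k * pop_i**alpha * pop_j**beta / distance**gamma


def radiation_flow(i, j, pop, dist, outflow_i):
    """Radiation model: T_ij = T_i * m_i * n_j / ((m_i + s) * (m_i + n_j + s))."""
    # s: population of all other places at least as close to i as j is
    s = sum(
        p
        for idx, p in enumerate(pop)
        if idx not in (i, j) and dist[i][idx] <= dist[i][j]
    )
    m_i, n_j = pop[i], pop[j]
    return outflow_i * m_i * n_j / ((m_i + s) * (m_i + n_j + s))


# Hypothetical toy system: three counties on a line (arbitrary distance units)
pop = [100.0, 200.0, 300.0]
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]

gravity_01 = gravity_flow(pop[0], pop[1], dist[0][1])
radiation_01 = radiation_flow(0, 1, pop, dist, outflow_i=10.0)
radiation_02 = radiation_flow(0, 2, pop, dist, outflow_i=10.0)
```

In practice the gravity parameters would be estimated from observed flows (for example with a Poisson regression on log population and log distance), whereas the radiation model needs only populations and the total outflow from each origin.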

Volume 27, Issue 3, pp. 455-472. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12774388/pdf/

Citations: 0

Leveraging principal component analysis to uncover urban pedestrian dynamics

Pub Date : 2025-01-01 Epub Date: 2025-06-10 DOI: 10.1007/s10109-025-00469-0

Jack Liddle, Wenhua Jiang, Nick Malleson

As the world rapidly urbanises and cities become larger and more complex, understanding pedestrian dynamics is paramount. New data sources, particularly those that measure pedestrian counts (i.e. 'footfall'), offer potential as a means of better understanding the fundamental spatio-temporal structures that characterise aggregate pedestrian behaviour. However, footfall data are often complex and influenced by a wide range of social, spatial and temporal factors, which complicates interpretation. This paper applies principal component analysis (PCA) to hourly pedestrian count data from Melbourne, Australia, to extract the key temporal signatures that underpin observed urban footfall patterns. PCA can reduce the dimensionality of noisy pedestrian flow data, revealing dominant activity patterns such as weekday commuting cycles and weekend leisure activities. By subsequently analysing pedestrian volumes through the lens of these components, we start to expose the underlying types of pedestrian activities that characterise different neighbourhoods. In addition, we can distinguish multiple overlapping activity patterns within a single location, identifying changes in urban functionality and detecting shifts in mobility trends. The impacts of external shocks, such as the COVID-19 pandemic, are particularly stark. These findings shed light on the intricacies of urban mobility and suggest that there is value in the use of PCA as a means to better understand urban dynamics.
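The idea of extracting temporal signatures with PCA can be sketched on synthetic data. The example below is a minimal illustration, not the authors' pipeline: the two activity profiles (a twin-peaked "commute" curve and a midday "leisure" curve), the noise level, and the 28-day layout are assumptions, and the PCA is computed directly from a singular value decomposition with NumPy. Each row of the matrix is one day of 24 hourly counts at a hypothetical sensor.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA via SVD of the column-centred data matrix (rows = days, cols = hours)."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(centred, full_matrices=False)
    explained = S**2 / np.sum(S**2)        # share of variance per component
    components = Vt[:n_components]         # temporal signatures (one per row)
    scores = centred @ components.T        # how strongly each day expresses them
    return components, scores, explained[:n_components]


# Synthetic footfall: assumed weekday commuting peaks and a weekend midday peak
hours = np.arange(24)
commute = np.exp(-0.5 * ((hours - 8) / 1.5) ** 2) + np.exp(-0.5 * ((hours - 17) / 1.5) ** 2)
leisure = np.exp(-0.5 * ((hours - 13) / 3.0) ** 2)

rng = np.random.default_rng(0)
days = []
for d in range(28):
    is_weekend = d % 7 >= 5
    profile = 300 * leisure if is_weekend else 500 * commute
    days.append(profile + rng.normal(0, 10, size=24))
X = np.vstack(days)

components, scores, explained = pca(X)
```

With this construction the first component contrasts the weekday and weekend profiles, so the sign of a day's first score indicates which pattern dominates on that day.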

Volume 27, Issue 3, pp. 425-453. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12425851/pdf/

Citations: 0

Point cluster analysis using weighted random labeling

Pub Date : 2024-09-10 DOI: 10.1007/s10109-024-00447-y

Yukio Sadahiro, Ikuho Yamada

This paper proposes a new method of point cluster analysis. There are at least three important points that we need to consider in the evaluation of point clusters. The first is spatial inhomogeneity, i.e., the inhomogeneity of locations where points can be located. The second is aspatial inhomogeneity, which indicates the inhomogeneity of point characteristics. The third is an explicit representation of the geographic scale of analysis. This paper proposes a method that considers these points in a statistical framework. We develop two measures of point clusters: local and global. The former permits us to discuss the spatial variation in point clusters, while the latter indicates the global tendency of point clusters. To test the method’s validity, this paper applies it to the analysis of hypothetical and real datasets. The results supported the soundness of the proposed method.
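For readers unfamiliar with random labeling, the sketch below shows the plain, unweighted version of the idea as a Monte Carlo test; it is not the weighted method proposed in the paper, whose details the abstract does not give. Point locations stay fixed (which respects spatial inhomogeneity), the labels are shuffled, and a local statistic is recomputed at an explicit radius (the scale of analysis). The function names, the grid of points, and the choice of statistic are assumptions for illustration.

```python
import math
import random


def local_count(points, labels, center, radius):
    """Number of labelled (label == 1) points within `radius` of `center`."""
    return sum(
        1
        for (x, y), lab in zip(points, labels)
        if lab == 1 and math.hypot(x - center[0], y - center[1]) <= radius
    )


def random_labeling_pvalue(points, labels, center, radius, n_sim=999, seed=0):
    """One-sided Monte Carlo p-value under the random labeling null hypothesis."""
    rng = random.Random(seed)
    observed = local_count(points, labels, center, radius)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # locations fixed, labels permuted
        if local_count(points, shuffled, center, radius) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_sim + 1)


# Hypothetical data: a 10 x 10 grid with all labelled points packed into one corner
points = [(x, y) for x in range(10) for y in range(10)]
labels = [1 if x < 3 and y < 3 else 0 for x, y in points]

observed, p_value = random_labeling_pvalue(points, labels, center=(1, 1), radius=1.5)
```

Repeating the test over many centers gives a local measure; summarizing it over the whole study area would give a global one.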

Citations: 0

Implications for spatial non-stationarity and the neighborhood effect averaging problem (NEAP) in green inequality research: evidence from three states in the USA

Pub Date : 2024-09-04 DOI: 10.1007/s10109-024-00448-x

Sophiya Gyanwali, Shashank Karki, Kee Moon Jang, Tom Crawford, Mengxi Zhang, Junghwan Kim

Recent studies on green space exposure have argued that overlooking human mobility could lead to erroneous estimates of exposure and of the associated inequality. However, these studies are limited because they focused on single cities rather than multiple cities, which could exhibit variations in people’s mobility patterns and in the spatial distribution of green spaces. Moreover, previous studies focused mainly on large cities while overlooking other areas, such as small cities and rural neighborhoods. In other words, the potential spatial non-stationarity issues in estimating green space exposure inequality remain unclear. To fill these research gaps, we used commute data for 31,862 people from Virginia, West Virginia, and Kentucky. A deep learning technique was used to extract green spaces from street-view images to estimate people’s home-based and mobility-based green exposure levels. The results showed that overall inequality in exposure levels was lower when people’s mobility was considered than when only home-based exposure levels were used, implying the presence of the neighborhood effect averaging problem (NEAP). Correlation coefficients between individual exposure levels and social vulnerability indices showed mixed and complex patterns across neighborhood types and sizes, indicating the presence of spatial non-stationarity. Our results underscore the crucial role of mobility in exposure assessments and the issue of spatial non-stationarity when evaluating exposure inequalities. The results imply that locally specific studies are urgently needed to develop local policies that precisely target inequality in exposure.

Citations: 0

Integrating big data with KNIME as an alternative without programming code: an application to the PATSTAT patent database

Pub Date : 2024-09-03 DOI: 10.1007/s10109-024-00445-0

Fernando H. Taques, Coro Chasco, Flávio H. Taques

Accessing massive datasets can be challenging for users unfamiliar with programming. Combining the Konstanz Information Miner (KNIME) and MySQL on standard-configuration equipment addresses this issue. This research proposal aims to present a methodology that describes the necessary configuration steps in both tools and the manipulation required in KNIME to transfer the information to the MySQL environment for further processing in a database management system (DBMS). In addition, we propose a procedure so that research using this point-and-click software gains in reproducibility and, therefore, in credibility within the scientific community. To achieve this, we use as a reference a large database of patent applications, PATSTAT Global 2023, provided by the European Patent Office (EPO). As is well known, patent data can be a valuable source for understanding innovation dynamics and technological trends, whether for studies of companies, sectors, nations, or even regions, at aggregated and disaggregated levels.

Citations: 0

Mobility deviation index: incorporating geographical context into analysis of human mobility

Pub Date : 2024-07-17 DOI: 10.1007/s10109-024-00444-1

Milad Malekzadeh, Jed A. Long

Many studies seek to study the relationship between socioeconomic factors and human mobility indicators. However, it is well documented that mobility levels are also driven by the geographical context where individual movement takes place. Here we test whether accounting for geographical context leads to new or different interpretations of human mobility behavior when studying associations with socioeconomic factors. Specifically, we define mobility deviation index as the relative level of observed mobility when compared to expected mobility for a specific location, where expected mobility accounts for geographical context. Our results highlight the significant role of context when interpreting spatial patterns of human mobility. We demonstrate that controlling for the effects of geographical context will substantially impact our interpretation of associations between measures of human mobility and socioeconomic variables. These results represent an important step in furthering our understanding of the role of place on human mobility patterns.
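The abstract defines the index only verbally, as observed mobility relative to the mobility expected for a location given its geographical context. One possible reading is sketched below; it should be taken as an assumption rather than the authors' formula. Here the expected value is simply the mean observed mobility of all individuals sharing the same context class, and the index is the relative deviation from that mean. The record fields, the context classes, and the mobility values (for example, a radius of gyration in km) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def expected_by_context(records):
    """Expected mobility per context class (assumed here: the class mean)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["context"]].append(rec["mobility"])
    return {context: mean(values) for context, values in groups.items()}


def mobility_deviation_index(observed, expected):
    """Relative deviation of observed from expected mobility (assumed form)."""
    return (observed - expected) / expected


# Hypothetical individuals: mobility could be a radius of gyration in km
records = [
    {"id": "a", "context": "urban", "mobility": 4.0},
    {"id": "b", "context": "urban", "mobility": 6.0},
    {"id": "c", "context": "rural", "mobility": 15.0},
    {"id": "d", "context": "rural", "mobility": 25.0},
]

expected = expected_by_context(records)
mdi = {
    rec["id"]: mobility_deviation_index(rec["mobility"], expected[rec["context"]])
    for rec in records
}
```

In this toy example person "c" travels much farther than person "b" in absolute terms, yet "c" falls below the rural expectation while "b" exceeds the urban one, which is the kind of reinterpretation the abstract describes.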

Citations: 0

Speeding up estimation of spatially varying coefficients models

Pub Date : 2024-07-02 DOI: 10.1007/s10109-024-00442-3

Ghislain Geniaux

Spatially varying coefficient models, such as GWR (Brunsdon et al. in Geogr Anal 28:281–298, 1996 and McMillen in J Urban Econ 40:100–124, 1996), find extensive applications across various fields, including housing markets, land use, population ecology, seismology, and mining research. These models are valuable for capturing the spatial heterogeneity of coefficient values. In many application areas, the continuous expansion of spatial data sample sizes, in terms of both volume and richness of explanatory variables, has given rise to new methodological challenges. The primary issues revolve around the time required to calculate each local coefficient and the memory required to store the large hat matrix (of size n × n) for parameter variance estimation. Researchers have explored various approaches to address these challenges (Harris et al. in Trans GIS 14:43–61, 2010; Pozdnoukhov and Kaiser in: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, 2011; Tran et al. in: 2016 Eighth International Conference on Knowledge and Systems Engineering (KSE), IEEE, 2016; Geniaux and Martinetti in Reg Sci Urban Econ 72:74–85, 2018; Li et al. in Int J Geogr Inf Sci 33:155–175, 2019; Murakami et al. in Ann Am Assoc Geogr 111:459–480, 2020). While the use of a subset of target points for local regressions has been extensively studied in nonparametric econometrics, its application within the context of GWR has been relatively unexplored. In this paper, we propose an original two-stage method designed to accelerate GWR computations. We select a subset of target points based on the spatial smoothing of residuals from a first-stage regression, conducting GWR solely on this subsample. Additionally, we propose an original approach for extrapolating coefficients to non-target points. In addition to using an effective sample of target points, we explore the computational gain provided by using a truncated Gaussian kernel to create sparser matrices during computation. Our Monte Carlo experiments demonstrate that this method of target point selection outperforms methods based on point density or random selection. The results also reveal that using target points can reduce bias and root mean square error (RMSE) in estimating β coefficients compared to traditional GWR, as it enables the selection of a more accurate bandwidth size. We demonstrate that our estimator is scalable and exhibits superior properties in this regard compared to the estimator of Murakami et al. (Ann Am Assoc Geogr 111:459–480, 2020) under two conditions: the use of a ratio of target points that provides a satisfactory approximation of coefficients (10–20% of locations) and an optimal bandwidth that remains within a reasonable neighborhood (< 5000 neighbors). All the estimators of GWR with target points are now accessible in the R package mgwrsar for GWR and Mixed GWR with or without spatial autocorrelation, available in the CRAN repository (https://CRAN.R-project.org/package=mgwrsar).
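The two computational ideas in the abstract, fitting local regressions only at a subset of target points and truncating the Gaussian kernel so that distant observations get zero weight, can be sketched with NumPy as follows. This is an illustrative sketch, not the mgwrsar implementation: the function names are hypothetical, the targets are drawn at random (the paper instead selects them from smoothed first-stage residuals and reports that random selection performs worse), the bandwidth and cutoff are fixed by hand, and the extrapolation of coefficients to non-target points is omitted.

```python
import numpy as np


def truncated_gaussian_weights(d, bandwidth, cutoff):
    """Gaussian kernel weights, set to zero beyond `cutoff` to induce sparsity."""
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    w[d > cutoff] = 0.0
    return w


def gwr_at_targets(coords, X, y, targets, bandwidth, cutoff):
    """Weighted least squares at each target point; returns one beta row per target."""
    design = np.column_stack([np.ones(len(y)), X])   # intercept + regressors
    betas = np.empty((len(targets), design.shape[1]))
    for row, idx in enumerate(targets):
        d = np.linalg.norm(coords - coords[idx], axis=1)
        w = truncated_gaussian_weights(d, bandwidth, cutoff)
        XtW = design.T * w                           # X'W without forming W
        betas[row] = np.linalg.solve(XtW @ design, XtW @ y)
    return betas


# Synthetic data (assumed): the slope increases from west to east
rng = np.random.default_rng(42)
n = 1000
coords = rng.uniform(size=(n, 2))
x = rng.normal(size=n)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + true_slope * x + rng.normal(scale=0.05, size=n)

# 10% of locations used as target points, chosen at random for simplicity
targets = rng.choice(n, size=n // 10, replace=False)
betas = gwr_at_targets(coords, x, y, targets, bandwidth=0.1, cutoff=0.3)
```

Only 100 local regressions are solved instead of 1000, and each uses only the observations inside the cutoff; a production implementation would exploit that sparsity explicitly rather than multiplying by zero weights.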

Citations: 0

Arthur Getis: a legend in geographical systems

Pub Date : 2024-06-19 DOI: 10.1007/s10109-024-00443-2

Alan T. Murray, Luc Anselin, Sergio J. Rey

The passing of Professor Arthur Getis in May of 2022 prompted a number of events to reflect on and remember the tremendous contributions he made over his career to geographical systems broadly, and to spatial analysis, spatial statistics, regional science, geography and GIScience, among others. These began with a series of sessions at the North American Regional Science Council meetings held in Montreal, Canada in November of 2022, followed by an invitation from the editors of Journal of Geographical Systems, Manfred Fischer, Antonio Paez, and Petra Staufer-Steinnocher, to organize a special issue in his honor. Soon thereafter, an open call was initiated to solicit submissions to this special issue, seeking original contributions that overlap and complement the broad range of research undertaken by Professor Getis over a distinguished career and life. This paper offers an overview of prominent geographical systems work carried out by Professor Getis. Reflections on his many contributions are also detailed. A summary of the contributions to this special issue is given, along with final thoughts.

Citations: 0

The distance decay effect and spatial reach of spillovers

Pub Date : 2024-05-18 DOI: 10.1007/s10109-024-00440-5

J. Paul Elhorst, Ioanna Tziolas, Chang Tan, Petros Milionis

This paper quantifies and graphically illustrates the distance decay effect and spatial reach of spillover effects derived from a spatial Durbin (SD) model with parameterized spatial weight matrices. Building on attributes of the concept of spatial autocorrelation developed by Arthur Getis, we adopt a distance-based negative exponential spatial weight matrix and parameterize it by a decay parameter that is different for each spatial lag in this model, both of the regressand and of all regressors. The quantification and illustration are applied to the spatially augmented neoclassical growth framework, which we estimate using data for 266 NUTS-2 regions in the EU over the period 2000–2018. We find distance decay parameters ranging from 0.233 to 2.224 and spatial reaches ranging from 700 to more than 1500 km for the different growth determinants in this model. These wide ranges highlight the restrictiveness of the conventional SD model based on one common spatial weight matrix for all spatial lags.
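A distance-based negative exponential weight matrix of the kind described above can be written as w_ij = exp(-delta * d_ij) for i ≠ j, with zeros on the diagonal. The sketch below builds such a matrix for two decay parameters to show how a larger delta concentrates weight on nearby regions. The function name, the row normalization, the three-region distance matrix, and the distance units are assumptions; the abstract does not state how the authors scale distances or normalize the matrix.

```python
import math


def exp_decay_weights(dist, delta, row_normalize=True):
    """Negative exponential weights w_ij = exp(-delta * d_ij), zero diagonal."""
    n = len(dist)
    W = [
        [0.0 if i == j else math.exp(-delta * dist[i][j]) for j in range(n)]
        for i in range(n)
    ]
    if row_normalize:
        for row in W:
            total = sum(row)
            if total > 0:
                row[:] = [w / total for w in row]
    return W


# Hypothetical distances between three regions on a line (arbitrary units)
dist = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]

# The two decay values are the extremes reported in the abstract
W_slow = exp_decay_weights(dist, delta=0.233)
W_fast = exp_decay_weights(dist, delta=2.224)
```

Allowing a separate delta for each spatially lagged variable, as the abstract describes, amounts to building one such matrix per variable instead of sharing a single W across the model.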

Citations: 0

Rethinking the null hypothesis in significant colocation pattern mining of spatial flows

Pub Date : 2024-05-03 DOI: 10.1007/s10109-024-00439-y

Mengjie Zhou, Mengjie Yang, Tinghua Ai, Jiannan Cai, Zhe Chen

Spatial flows represent spatial interactions or movements. Mining colocation patterns of different types of flows may uncover the spatial dependences and associations among flows. Previous studies proposed a flow colocation pattern mining method and established a significance test under the null hypothesis of independence for the results. In fact, the definition of the null hypothesis is crucial in significance testing. Choosing an inappropriate null hypothesis may lead to misunderstandings about the spatial interactions between flows. In practice, the overall distribution patterns of different types of flows may be clustered. In these cases, the null hypothesis of independence will result in unconvincing results. Thus, considering the overall spatial pattern of flows, in this study, we changed the null hypothesis to random labeling to establish the statistical significance of flow colocation patterns. Furthermore, we compared and analyzed the impacts of different null hypotheses on flow colocation pattern mining through synthetic data tests with different preset patterns and situations. Additionally, we used empirical data from ride-hailing trips to show the practicality of the method.

Citations: 0