The Regionalization and Aggregation of In‐App Location Data to Maximize Information and Minimize Data Disclosure

IF 3.3, Zone 3 (Earth Sciences), Q1 GEOGRAPHY, Geographical Analysis, Pub Date: 2024-06-07, DOI: 10.1111/gean.12406
Louise Sieg, James Cheshire

Abstract

To minimize the disclosure of personal information, sensitive location data collected by mobile phones is often aggregated to predefined geographic units and presented as counts of devices at a given time. Grids, or units created by statistical agencies for the dissemination of traditional data sets such as censuses, are common choices for this aggregation process. However, these can produce large variations in the number of devices encapsulated within each geographic unit, leading to over‐generalization and a loss of information in some areas. To alleviate this issue, we propose a new method for aggregating location data sets generated by mobile phones. The method creates bespoke geometries that maximize the granularity of the data whilst minimizing the risk of disclosing personal information. The resulting small areas are built on Uber's H3 hexagonal indexing system by attributing activity counts and land‐use features to each cell, then merging cells into geographies that contain a predetermined number of data points and respect the underlying topography and land use. This methodology can be applied to widely available data sets and enables bespoke geographical units to be created for different contexts. We compare the generated units to established aggregates from the England and Wales Census and Ordnance Survey. We demonstrate that our outputs are more representative of the original mobile phone data set and minimize data omission caused by low counts. This speaks to the need for a data‐driven and context‐driven regionalization methodology.
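The merging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify in detail. It assumes a simple greedy region-growing rule, uses plain axial hexagon coordinates in place of real H3 cell indexes, and treats "respecting land use" as "only merge cells with the same land-use label". The function name `merge_cells` and all parameters are hypothetical.

```python
from collections import deque

# Axial-coordinate offsets of the six neighbours of a hexagonal cell.
# (Assumption: a toy hex grid stands in for H3 cell indexes.)
HEX_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def merge_cells(counts, land_use, min_count):
    """Greedily grow regions of hex cells until each holds `min_count` points.

    counts:   {(q, r): number of data points in the cell}
    land_use: {(q, r): land-use label of the cell}
    Returns (region_of, totals), where region_of maps each cell to a
    region id and totals[i] is the number of points in region i.
    """
    region_of = {}
    totals = []
    # Seed from the busiest cells first so dense areas stay fine-grained.
    for seed in sorted(counts, key=counts.get, reverse=True):
        if seed in region_of:
            continue
        rid = len(totals)
        region_of[seed] = rid
        total = counts[seed]
        frontier = deque([seed])
        # Breadth-first growth: absorb unassigned neighbours that share the
        # seed's land-use label until the threshold is reached.
        while frontier and total < min_count:
            q, r = frontier.popleft()
            for dq, dr in HEX_NEIGHBOURS:
                cell = (q + dq, r + dr)
                if (cell in counts and cell not in region_of
                        and land_use[cell] == land_use[seed]):
                    region_of[cell] = rid
                    total += counts[cell]
                    frontier.append(cell)
                    if total >= min_count:
                        break
        totals.append(total)
    return region_of, totals


if __name__ == "__main__":
    demo_counts = {(0, 0): 12, (1, 0): 3, (0, 1): 4, (1, -1): 3, (5, 5): 1}
    demo_land_use = {cell: "urban" for cell in demo_counts}
    regions, sizes = merge_cells(demo_counts, demo_land_use, min_count=10)
    print(regions)
    print(sizes)
```

In this sketch a region can end up below `min_count` when it runs out of compatible neighbours (the isolated cell in the demo data is one such case). A fuller treatment would need a further step, for example merging such leftovers into an adjacent region or suppressing them, and would take cell identifiers and adjacency from the H3 library rather than from hand-written offsets.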
Source journal
CiteScore: 8.70
Self-citation rate: 5.60%
Articles published: 40
Journal description: First in its specialty area and one of the most frequently cited publications in geography, Geographical Analysis has, since 1969, presented significant advances in geographical theory, model building, and quantitative methods to geographers and scholars in a wide spectrum of related fields. Traditionally, mathematical and nonmathematical articulations of geographical theory, and statements and discussions of the analytic paradigm are published in the journal. Spatial data analyses and spatial econometrics and statistics are strongly represented.