A districting problem with data reliability constraints for equity analysis

Impact Factor: 7.6 · Zone 1 (Engineering Technology) · JCR Q1 (Transportation Science & Technology) · Journal: Transportation Research Part C-Emerging Technologies · Pub Date: 2024-07-10 · DOI: 10.1016/j.trc.2024.104759
Bingqing Liu, Farnoosh Namdarpour, Joseph Y.J. Chow
URL: https://www.sciencedirect.com/science/article/pii/S0968090X24002808

Abstract

While data plays an important role in transportation research, sampled data is not always reliable. Data reliability issues are especially significant for minority groups. In this study, a districting approach is proposed that improves data reliability through aggregation of basic spatial units (BSUs), adapted from the max-p-regions problem. The model generates as many aggregated zones as possible that minimize intrazonal heterogeneity while controlling the data margin of error (MOE) of all aggregated zones with an MOE threshold. The problem is first formulated as an integer program that selects an optimal set of zones from a pre-generated set of candidate zones. The difficulty of solving the formulation lies in generating the candidate set, so a heuristic solution algorithm is proposed. Two case studies illustrate the method and validate its performance by evaluating the resulting data quality in an example subsequent planning model. The first is an area in Downtown Manhattan with 62 census tracts, where the aggregated zones are compared with Neighborhood Tabulation Areas (NTAs) and Taxi Zones. The second is the generation of the New York City Equitable Zoning (NYCEZ), which produced 574 Equitable Zones that reduce the average MOE% of demographic data by 48% for seniors, 75% for the low-income population, and 46% for long commuters, all with a number of districts higher than that of NTAs (221) and Taxi Zones (263). NYCEZ and census tracts are then compared in a subsequent model, synthetic population generation, showing a 6.2% improvement in standard deviation across simulated populations under the proposed zone design; that is, NYCEZ showed smaller variation in the generated population data. The algorithm can support decision making by public agencies and service design by mobility providers by producing reliable and equitable data. It can also be applied to data sharing between mobility providers and agencies to alleviate privacy concerns.
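
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation or heuristic. It assumes that a zone's MOE is approximated by the root-sum-of-squares of its BSU MOEs (a common approximation for survey estimates that are summed), that MOE% is MOE divided by the estimate, and that the candidate zones are already given. The toy data, `zone_moe_pct`, and `select_zones` are hypothetical names introduced for illustration; the heterogeneity objective is omitted, and brute-force enumeration stands in for an integer programming solver.

```python
import math
from itertools import combinations

# Hypothetical toy data: (estimate, MOE) for each basic spatial unit (BSU).
bsu = {
    "A": (120, 60),
    "B": (80, 50),
    "C": (200, 70),
    "D": (150, 65),
}

def zone_moe_pct(units):
    """MOE% of a zone whose estimate is the sum of its BSU estimates.

    Assumption: the MOE of a sum is approximated by the square root of
    the sum of squared component MOEs.
    """
    est = sum(bsu[u][0] for u in units)
    moe = math.sqrt(sum(bsu[u][1] ** 2 for u in units))
    return 100.0 * moe / est

# Hypothetical pre-generated candidate zones (contiguity assumed checked).
candidates = [
    frozenset("A"), frozenset("B"), frozenset("C"), frozenset("D"),
    frozenset("AB"), frozenset("CD"), frozenset("BC"), frozenset("ABCD"),
]

def select_zones(candidates, threshold):
    """Pick the largest set of candidate zones that partitions all BSUs
    with every zone's MOE% at or below the threshold (brute force)."""
    feasible = [z for z in candidates if zone_moe_pct(z) <= threshold]
    all_units = frozenset(bsu)
    # Try the largest number of zones first, in the spirit of max-p-regions.
    for p in range(len(feasible), 0, -1):
        for combo in combinations(feasible, p):
            covers_all = frozenset().union(*combo) == all_units
            no_overlap = sum(len(z) for z in combo) == len(all_units)
            if covers_all and no_overlap:
                return list(combo)
    return None  # no feasible partition under this threshold

if __name__ == "__main__":
    for zone in select_zones(candidates, threshold=40.0) or []:
        print(sorted(zone), round(zone_moe_pct(zone), 2))
```

In this toy instance, aggregating A with B and C with D brings each zone under the 40% threshold even though A, B, and D individually exceed it. Enumeration only works for a handful of candidates; the abstract notes that generating the candidate set is the hard part, which is why the paper proposes a heuristic.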

Source journal
CiteScore: 15.80
Self-citation rate: 12.00%
Articles published: 332
Review time: 64 days
Journal description: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.