Copula-based transferable models for synthetic population generation

IF 7.6 | Zone 1, Engineering and Technology | Q1, TRANSPORTATION SCIENCE & TECHNOLOGY | Transportation Research Part C-Emerging Technologies | Pub Date: 2024-09-09 | DOI: 10.1016/j.trc.2024.104830
Article URL: https://www.sciencedirect.com/science/article/pii/S0968090X24003516
Citations: 0

Abstract

Population synthesis involves generating synthetic yet realistic representations of a target population of micro-agents for behavioral modeling and simulation. Traditional methods, often reliant on target population samples, such as census data or travel surveys, face limitations due to high costs and small sample sizes, particularly at smaller geographical scales. We propose a novel framework based on copulas to generate synthetic data for target populations where only empirical marginal distributions are known. This method utilizes samples from different populations with similar marginal dependencies, introduces a spatial component into population synthesis, and considers various information sources for more realistic generators. Concretely, the process involves normalizing the data and treating it as realizations of a given copula, and then training a generative model before incorporating the information on the marginals of the target population. Utilizing American Community Survey data, we assess our framework’s performance through standardized root mean squared error (SRMSE) and so-called sampled zeros. We focus on its capacity to transfer a model learned from one population to another. Our experiments include transfer tests between regions at the same geographical level as well as to lower geographical levels, hence evaluating the framework’s adaptability in varied spatial contexts. We compare Bayesian Networks, Variational Autoencoders, and Generative Adversarial Networks, both individually and combined with our copula framework. Results show that the copula enhances machine learning methods in matching the marginals of the reference data. Furthermore, it consistently surpasses Iterative Proportional Fitting in terms of SRMSE in the transferability experiments, while introducing unique observations not found in the original training sample.
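
The pipeline described in the abstract (normalize the source data, treat it as copula realizations, then impose the target marginals) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes integer-coded ordinal variables, randomized rank-based pseudo-observations, and a plain bootstrap of those pseudo-observations as a stand-in for the trained generative model. The function names `to_pseudo_observations`, `apply_target_marginals`, and `srmse` are hypothetical, and the SRMSE formula shown is one common definition over joint relative frequencies, which may differ in detail from the paper's.

```python
import numpy as np


def to_pseudo_observations(x, rng):
    """Map each column to (0, 1) using ranks, breaking ties at random."""
    n, d = x.shape
    u = np.empty((n, d), dtype=float)
    for j in range(d):
        jitter = rng.random(n)
        # lexsort uses the LAST key as primary: sort by value, then by jitter.
        order = np.lexsort((jitter, x[:, j]))
        ranks = np.empty(n, dtype=float)
        ranks[order] = np.arange(1, n + 1)
        u[:, j] = ranks / (n + 1)
    return u


def apply_target_marginals(u, target_marginals):
    """Push copula samples through the inverse CDF of each target marginal.

    target_marginals[j] is (values, probs), with values in ascending order.
    """
    out = np.empty(u.shape, dtype=int)
    for j, (values, probs) in enumerate(target_marginals):
        cdf = np.cumsum(probs)
        idx = np.searchsorted(cdf, u[:, j], side="left")
        idx = np.minimum(idx, len(values) - 1)  # guard against rounding in cdf
        out[:, j] = np.asarray(values)[idx]
    return out


def srmse(a, b, categories):
    """SRMSE between the joint relative frequencies of two coded samples."""
    shape = tuple(categories)
    pa = np.zeros(shape)
    pb = np.zeros(shape)
    np.add.at(pa, tuple(a.T), 1.0)
    np.add.at(pb, tuple(b.T), 1.0)
    pa /= pa.sum()
    pb /= pb.sum()
    # Relative frequencies sum to 1, so the mean cell frequency is 1 / n_cells.
    return float(np.sqrt(np.sum((pa - pb) ** 2) * pa.size))


rng = np.random.default_rng(0)

# Source population (assumed): two dependent ordinal variables, levels 0..2.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=5000)
source = np.digitize(z, [-0.5, 0.5])

# Target population: only the marginals are known (illustrative numbers).
target_marginals = [
    ([0, 1, 2], [0.2, 0.5, 0.3]),
    ([0, 1, 2], [0.6, 0.3, 0.1]),
]

u = to_pseudo_observations(source, rng)
# Stand-in for a generative model trained on u: resample with replacement.
u_new = u[rng.integers(0, len(u), size=5000)]
synthetic = apply_target_marginals(u_new, target_marginals)

freq0 = np.bincount(synthetic[:, 0], minlength=3) / len(synthetic)
freq1 = np.bincount(synthetic[:, 1], minlength=3) / len(synthetic)
print("marginal 0:", freq0)
print("marginal 1:", freq1)
print("SRMSE vs. source:", srmse(synthetic, source, (3, 3)))
```

In this sketch the synthetic sample reproduces the target marginals while inheriting its dependence structure from the source sample. Replacing the bootstrap line with samples drawn from a Bayesian Network, Variational Autoencoder, or Generative Adversarial Network fitted to `u` would correspond more closely to the combinations compared in the abstract.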

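Source journal: Transportation Research Part C-Emerging Technologies
CiteScore: 15.80
Self-citation rate: 12.00%
Articles published: 332
Review time: 64 days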
Journal description: Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.