ViFi-Loc: Multi-modal Pedestrian Localization using GAN with Camera-Phone Correspondences

Hansi Liu, Hongsheng Lu, Kristin Dana, Marco Gruteser
{"title":"ViFi-Loc: Multi-modal Pedestrian Localization using GAN with Camera-Phone Correspondences","authors":"Hansi Liu, Hongsheng Lu, Kristin Data, Marco Gruteser","doi":"10.1145/3577190.3614119","DOIUrl":null,"url":null,"abstract":"In Smart City and Vehicle-to-Everything (V2X) systems, acquiring pedestrians’ accurate locations is crucial to traffic and pedestrian safety. Current systems adopt cameras and wireless sensors to estimate people’s locations via sensor fusion. Standard fusion algorithms, however, become inapplicable when multi-modal data is not associated. For example, pedestrians are out of the camera field of view, or data from the camera modality is missing. To address this challenge and produce more accurate location estimations for pedestrians, we propose a localization solution based on a Generative Adversarial Network (GAN) architecture. During training, it learns the underlying linkage between pedestrians’ camera-phone data correspondences. During inference, it generates refined position estimations based only on pedestrians’ phone data that consists of GPS, IMU, and FTM. Results show that our GAN produces 3D coordinates at 1 to 2 meters localization error across 5 different outdoor scenes. We further show that the proposed model supports self-learning. The generated coordinates can be associated with pedestrians’ bounding box coordinates to obtain additional camera-phone data correspondences. This allows automatic data collection during inference. Results show that after fine-tuning the GAN model on the expanded dataset, localization accuracy is further improved by up to 26%.","PeriodicalId":93171,"journal":{"name":"Companion Publication of the 2020 International Conference on Multimodal Interaction","volume":"94 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Companion Publication of the 2020 International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3577190.3614119","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In Smart City and Vehicle-to-Everything (V2X) systems, acquiring accurate pedestrian locations is crucial to traffic and pedestrian safety. Current systems use cameras and wireless sensors to estimate people’s locations via sensor fusion. Standard fusion algorithms, however, become inapplicable when the multi-modal data is not associated, for example, when pedestrians are outside the camera’s field of view or data from the camera modality is missing. To address this challenge and produce more accurate location estimates for pedestrians, we propose a localization solution based on a Generative Adversarial Network (GAN) architecture. During training, it learns the underlying linkage between pedestrians’ camera and phone data from camera-phone correspondences. During inference, it generates refined position estimates from pedestrians’ phone data alone, which consists of GPS, IMU, and FTM (Fine Time Measurement) readings. Results show that our GAN produces 3D coordinates with 1 to 2 meters of localization error across 5 different outdoor scenes. We further show that the proposed model supports self-learning: the generated coordinates can be associated with pedestrians’ bounding box coordinates to obtain additional camera-phone data correspondences, enabling automatic data collection during inference. After fine-tuning the GAN model on the expanded dataset, localization accuracy improves by up to 26%.
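The paper itself ships no code, but the abstract's description (a generator conditioned on phone data, trained adversarially against camera-derived positions) maps onto a standard conditional GAN. Below is a minimal PyTorch sketch under that reading; the feature dimensions, network sizes, and loss are illustrative assumptions, not the authors' implementation. A "real" pair here is a phone feature vector together with the 3D position recovered from the camera track it was associated with during data collection.

```python
# Minimal sketch of a conditional GAN mapping phone measurements
# (GPS, IMU, FTM) to a 3D pedestrian coordinate. All dimensions and
# architecture choices are illustrative assumptions.
import torch
import torch.nn as nn

PHONE_DIM = 3 + 9 + 2   # assumed: GPS (lat/lon/alt), 9-axis IMU, FTM (range, std)
NOISE_DIM = 16          # assumed latent noise size
COORD_DIM = 3           # 3D location output

class Generator(nn.Module):
    """Generates a refined 3D coordinate conditioned on phone data."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(PHONE_DIM + NOISE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, COORD_DIM),
        )

    def forward(self, phone, z):
        return self.net(torch.cat([phone, z], dim=-1))

class Discriminator(nn.Module):
    """Scores whether a (phone data, 3D coordinate) pair is a real
    camera-derived correspondence or a generated one."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(PHONE_DIM + COORD_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, phone, coord):
        return self.net(torch.cat([phone, coord], dim=-1))

def train_step(G, D, opt_g, opt_d, phone, real_coord, bce=nn.BCEWithLogitsLoss()):
    """One adversarial update on a batch of camera-phone correspondences."""
    n = phone.size(0)
    z = torch.randn(n, NOISE_DIM)
    fake_coord = G(phone, z)

    # Discriminator: real pairs -> 1, generated pairs -> 0.
    opt_d.zero_grad()
    d_loss = (bce(D(phone, real_coord), torch.ones(n, 1)) +
              bce(D(phone, fake_coord.detach()), torch.zeros(n, 1)))
    d_loss.backward()
    opt_d.step()

    # Generator: produce coordinates the discriminator accepts as real.
    opt_g.zero_grad()
    g_loss = bce(D(phone, fake_coord), torch.ones(n, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

At inference time only the generator is kept: it consumes phone data (plus noise) and emits the refined 3D position, which matches the abstract's claim that estimates are produced from phone data alone.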
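The self-learning step also lends itself to a small sketch. One plausible reading, and it is only a reading, is that generated 3D coordinates are projected into the camera image and matched to detected bounding boxes, with matched pairs recycled as new training correspondences. The pinhole projection, the bottom-center box anchor, and the 50-pixel gate below are all assumptions for illustration.

```python
# Sketch of the self-learning step: associate GAN-generated 3D coordinates
# with camera bounding boxes to harvest new camera-phone correspondences.
import numpy as np
from scipy.optimize import linear_sum_assignment

def project(points_3d, K, Rt):
    """Pinhole projection of Nx3 world points to Nx2 pixel coordinates,
    assuming a calibrated camera with intrinsics K (3x3) and extrinsics Rt (3x4)."""
    homo = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    cam = (K @ (Rt @ homo.T)).T            # Nx3 homogeneous image coords
    return cam[:, :2] / cam[:, 2:3]

def associate(gen_coords, boxes, K, Rt, max_px=50.0):
    """Match generated coordinates (Nx3) to bounding boxes (Mx4, x1y1x2y2)
    by pixel distance between projected points and box bottom-centers,
    using Hungarian matching. Returns (i, j) pairs usable as new
    camera-phone training correspondences."""
    px = project(gen_coords, K, Rt)
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2, boxes[:, 3]], axis=1)
    cost = np.linalg.norm(px[:, None, :] - centers[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_px]
```

Pairs returned by associate() would then be appended to the training set and the GAN fine-tuned on the expanded data, which is the mechanism the abstract credits with the up-to-26% accuracy improvement.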