Extracting the U.S. building types from OpenStreetMap data

Henrique F. de Arruda, Sandro M. Reia, Shiyang Ruan, Kuldip S. Atwal, Hamdi Kavak, Taylor Anderson, Dieter Pfoser
{"title":"Extracting the U.S. building types from OpenStreetMap data","authors":"Henrique F. de Arruda, Sandro M. Reia, Shiyang Ruan, Kuldip S. Atwal, Hamdi Kavak, Taylor Anderson, Dieter Pfoser","doi":"arxiv-2409.05692","DOIUrl":null,"url":null,"abstract":"Building type information is crucial for population estimation, traffic\nplanning, urban planning, and emergency response applications. Although\nessential, such data is often not readily available. To alleviate this problem,\nthis work creates a comprehensive dataset by providing\nresidential/non-residential building classification covering the entire United\nStates. We propose and utilize an unsupervised machine learning method to\nclassify building types based on building footprints and available\nOpenStreetMap information. The classification result is validated using\nauthoritative ground truth data for select counties in the U.S. The validation\nshows a high precision for non-residential building classification and a high\nrecall for residential buildings. We identified various approaches to improving\nthe quality of the classification, such as removing sheds and garages from the\ndataset. Furthermore, analyzing the misclassifications revealed that they are\nmainly due to missing and scarce metadata in OSM. A major result of this work\nis the resulting dataset of classifying 67,705,475 buildings. 
We hope that this\ndata is of value to the scientific community, including urban and\ntransportation planners.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":"120 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Social and Information Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.05692","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Building type information is crucial for population estimation, traffic planning, urban planning, and emergency response applications. Despite its importance, such data is often not readily available. To alleviate this problem, this work creates a comprehensive dataset that classifies buildings as residential or non-residential across the entire United States. We propose and apply an unsupervised machine learning method that classifies building types based on building footprints and available OpenStreetMap (OSM) information. The classification is validated against authoritative ground truth data for selected U.S. counties. The validation shows high precision for non-residential buildings and high recall for residential buildings. We identified several ways to improve classification quality, such as removing sheds and garages from the dataset. Furthermore, an analysis of the misclassifications revealed that they are mainly due to missing or sparse metadata in OSM. A major result of this work is a dataset that classifies 67,705,475 buildings. We hope that this data is of value to the scientific community, including urban and transportation planners.
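The validation described in the abstract reports precision and recall per class. As a minimal sketch of how these two metrics are computed for a residential/non-residential labeling, the Python snippet below uses a hypothetical helper, `precision_recall`, and toy labels invented purely for illustration; neither the function name, the label strings, nor the numbers come from the paper.

```python
from collections import Counter


def precision_recall(y_true, y_pred, positive):
    """Return (precision, recall) treating `positive` as the class of interest."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1  # correctly predicted as the positive class
        elif pred == positive:
            counts["fp"] += 1  # predicted positive, but ground truth disagrees
        elif truth == positive:
            counts["fn"] += 1  # positive in ground truth, but missed
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, invented for illustration only (not results from the paper).
ground_truth = ["res", "res", "res", "res", "nonres", "nonres"]
predicted = ["res", "res", "res", "res", "nonres", "res"]

res_p, res_r = precision_recall(ground_truth, predicted, "res")
nonres_p, nonres_r = precision_recall(ground_truth, predicted, "nonres")

print(f"residential:     precision={res_p:.2f} recall={res_r:.2f}")
print(f"non-residential: precision={nonres_p:.2f} recall={nonres_r:.2f}")
```

In this toy example one non-residential building is labeled residential, which leaves residential recall and non-residential precision at 1.0 while lowering the other two values. This only illustrates how the metrics relate to each other; it is not a claim about the error patterns reported in the paper.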