Leveraging machine learning to generate a unified and complete building height dataset for Germany
Kristina Dabrock, Noah Pflugradt, Jann Michael Weinand, Detlef Stolten
Energy and AI, published 2024-07-31. DOI: 10.1016/j.egyai.2024.100408
Building geometry data is crucial for detailed, spatially explicit analyses of the building stock in energy systems analysis and beyond. Despite the existence of diverse datasets and methods, no standardized and validated approach is yet available for creating a unified and complete nationwide dataset of German building heights. This study develops and validates such a methodology, combining different data sources for building footprints and heights and filling gaps in the height data with an XGBoost machine learning algorithm. The XGBoost model achieves a mean absolute error of 1.78 m at the national level and between 1.52 m and 3.47 m at the federal state level. The goal is to demonstrate the applicability of the methodology at large scale and to create a useful dataset. The resulting dataset is thoroughly evaluated building by building, and spatially resolved statistics on its quality are reported. This detailed validation found that the building number and footprint area of the German building stock are 90.31 % and 94.84 % correct, respectively, and that the building height accuracy is 0.59 m at the national level. However, errors are not homogeneous across Germany, and further research is needed into the impact of including additional datasets, especially for regions and building types with lower accuracies. This study shows that the chosen methodology is useful for generating a building height dataset and that the workflow, with some modifications for regional data availability, can be transferred to other countries. The generated building dataset for Germany constitutes a valuable data basis for the research community in fields such as energy research, urban planning and building decarbonization policy development.
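To make the two reporting levels of the mean absolute error concrete, the following is a minimal Python sketch of how a national and a per-state figure could be computed from the same set of buildings. The abstract does not describe the aggregation, so the per-building averaging, the function names `mean_absolute_error` and `mae_by_state`, and the sample records are all assumptions made for illustration; the numbers are invented and are not results from the study.

```python
from collections import defaultdict


def mean_absolute_error(pairs):
    """Mean of |predicted - reference| over (predicted, reference) pairs, in metres."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one building")
    return sum(abs(p - r) for p, r in pairs) / len(pairs)


def mae_by_state(records):
    """Group (state, predicted_m, reference_m) records by state; return MAE per state."""
    grouped = defaultdict(list)
    for state, predicted, reference in records:
        grouped[state].append((predicted, reference))
    return {state: mean_absolute_error(pairs) for state, pairs in grouped.items()}


# Invented illustrative records (state, predicted height, reference height).
# These are NOT values from the study.
records = [
    ("Bavaria", 9.0, 8.0),
    ("Bavaria", 12.5, 14.5),
    ("Berlin", 21.0, 18.0),
    ("Berlin", 6.0, 7.0),
    ("Berlin", 15.0, 15.0),
]

# National figure: average over all buildings, regardless of state.
national_mae = mean_absolute_error((p, r) for _, p, r in records)
# State figures: the same metric restricted to each state's buildings.
state_mae = mae_by_state(records)

print(f"national MAE: {national_mae:.2f} m")
for state, value in sorted(state_mae.items()):
    print(f"{state}: {value:.2f} m")
```

Under this assumption the national value is a building-weighted average, so it always lies between the lowest and highest state values, which is consistent with the reported 1.78 m falling inside the 1.52 m to 3.47 m state range.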