Camilo Meyberg, Ulrich Rendtel, Holger Leerhoff. AStA Wirtschafts- und Sozialstatistisches Archiv, 18(2), 245-278. Published 2024-06-24. DOI: 10.1007/s11943-024-00340-6. https://link.springer.com/article/10.1007/s11943-024-00340-6
Flat rent price prediction in Berlin with web scraping
Internet data pose a challenge to the traditional system of official statistics, which relies on conventional sources such as surveys and registers that cannot readily be adapted to rapid change. Expanding this system to include internet data is currently at an experimental stage, in which the potential and benefits of these sources are being explored. This paper describes a project conducted within the ESSnet Trusted Smart Statistics – Web Intelligence Network framework. It investigates the use of online apartment listings to analyze the rental market. We used web scraping to extract information on flats in the city of Berlin from two online real estate portals. Using these data, we developed a model to predict rental prices per square meter from a flat's features and its location within the city. Offers that appear on both portals were detected by statistical matching, and the duplicates were removed. Missing values were treated by multiple imputation. The prediction model is semi-parametric, with postal districts used to describe the location effect. Comparisons with microcensus results and the local rent index reveal significant differences between the market of online flat offers and the stock of existing flat contracts. The commented programming code is available in the internet supplement.
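The abstract does not spell out how the cross-portal matching works, so the following is only a minimal sketch of the general idea, not the authors' procedure. It replaces statistical matching with a simple deterministic rule: two listings from different portals count as the same flat when postal code and room count agree and area and rent fall within a tolerance. The `Listing` fields, the portal labels, the tolerance values, and the example records are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One scraped offer; all field names are hypothetical."""
    portal: str        # illustrative portal label, e.g. "A" or "B"
    postal_code: str   # postal district used as the location key
    area_sqm: float    # living area in square meters
    rent_eur: float    # monthly rent in euros
    rooms: float       # number of rooms


def rent_per_sqm(listing: Listing) -> float:
    """Target quantity of the prediction model: rent per square meter."""
    return listing.rent_eur / listing.area_sqm


def is_probable_duplicate(a: Listing, b: Listing,
                          area_tol: float = 1.0,
                          rent_tol: float = 0.02) -> bool:
    """Rule-based stand-in for statistical matching (tolerances are assumed)."""
    if a.portal == b.portal:
        return False  # only compare offers across portals
    if a.postal_code != b.postal_code or a.rooms != b.rooms:
        return False
    if abs(a.area_sqm - b.area_sqm) > area_tol:
        return False
    # Relative rent difference, measured against the larger of the two rents.
    return abs(a.rent_eur - b.rent_eur) <= rent_tol * max(a.rent_eur, b.rent_eur)


def deduplicate(listings: list[Listing]) -> list[Listing]:
    """Keep the first occurrence of each flat and drop later probable duplicates."""
    kept: list[Listing] = []
    for candidate in listings:
        if not any(is_probable_duplicate(candidate, k) for k in kept):
            kept.append(candidate)
    return kept


# Made-up example records, not data from the study.
listings = [
    Listing("A", "10115", 60.0, 900.0, 2.0),
    Listing("B", "10115", 60.5, 905.0, 2.0),  # near-identical to the first
    Listing("B", "12043", 45.0, 700.0, 1.5),
]
unique = deduplicate(listings)
print(len(unique), rent_per_sqm(unique[0]))
```

A rule like this is easy to audit but brittle: a probabilistic matching score over several attributes, as the term "statistical matching" suggests, would tolerate the missing or inconsistent fields that scraped listings often contain.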