Using Machine Learning to Evaluate Real Estate Prices Using Location Big Data

W. Coleman, Ben Johann, Nicholas Pasternak, Jaya Vellayan, N. Foutz, Heman Shakeri
2022 Systems and Information Engineering Design Symposium (SIEDS). Published 2022-04-28. DOI: 10.48550/arXiv.2205.01180 (https://doi.org/10.48550/arXiv.2205.01180)

Abstract

With growing interest in entering the real estate market, accurate valuations of residential and commercial properties have become crucial. Past research has relied on static real estate data (e.g., number of beds, baths, square footage), sometimes combined with demographic information, to predict property prices. In this investigation, we set out to improve on that work by determining whether mobile location data can improve the predictive power of popular regression and tree-based models. To prepare the data, we attached the mobility data to individual properties in the real estate data, aggregating the users observed within 500 meters of each property for each day of the week. We removed people who lived within 500 meters of each property, so each property's aggregated mobility data contained only non-resident census features. In addition to these dynamic census features, we included static census features, including the number of people in the area, the average proportion of people commuting, and the number of residents in the area. Finally, we tested multiple models to predict real estate prices. Our proposed model consists of two random forest modules stacked using a ridge regression that takes the random forest outputs as predictors. The first random forest used static features only, and the second used dynamic features only. Comparing the model with and without the dynamic mobile location features shows that including them yields a 3% lower mean squared error.
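The mobility preprocessing described in the abstract (a 500-meter buffer, per-weekday aggregation, residents excluded) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the record layouts, the `homes` lookup of inferred home locations, the haversine distance, and the choice to count distinct users are all hypothetical. Only the 500-meter radius and the day-of-week grouping come from the abstract.

```python
import math
from collections import defaultdict
from datetime import date

EARTH_RADIUS_M = 6_371_000.0
RADIUS_M = 500.0  # buffer radius stated in the abstract


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def visitors_by_weekday(properties, pings, homes):
    """Count distinct non-resident users near each property, per weekday.

    properties: {property_id: (lat, lon)}
    pings:      iterable of (user_id, lat, lon, date)
    homes:      {user_id: (lat, lon)}, hypothetical inferred home locations
    Returns {property_id: [monday_count, ..., sunday_count]}.
    """
    seen = defaultdict(set)  # (property_id, weekday) -> set of user ids
    for user_id, lat, lon, day in pings:
        for pid, (plat, plon) in properties.items():
            if haversine_m(lat, lon, plat, plon) > RADIUS_M:
                continue  # ping falls outside this property's buffer
            home = homes.get(user_id)
            if home is not None and haversine_m(home[0], home[1], plat, plon) <= RADIUS_M:
                continue  # user lives inside the buffer, so drop as a resident
            # Assumption: users with no known home are kept as visitors.
            seen[(pid, day.weekday())].add(user_id)
    return {pid: [len(seen[(pid, wd)]) for wd in range(7)] for pid in properties}


# Toy data (made up for illustration); 0.001 degrees of latitude is about 111 m.
lat0, lon0 = 38.0336, -78.5080
properties = {"p1": (lat0, lon0)}
homes = {"a": (lat0 + 0.05, lon0), "b": (lat0 + 0.001, lon0)}
pings = [
    ("a", lat0 + 0.002, lon0, date(2022, 4, 25)),  # Monday, visitor
    ("a", lat0 + 0.002, lon0, date(2022, 4, 25)),  # same user, same weekday
    ("b", lat0 + 0.001, lon0, date(2022, 4, 25)),  # resident, excluded
    ("c", lat0 + 0.001, lon0, date(2022, 4, 26)),  # Tuesday, home unknown
    ("a", lat0 + 0.010, lon0, date(2022, 4, 26)),  # about 1.1 km away, outside
]
counts = visitors_by_weekday(properties, pings, homes)
print(counts)  # {'p1': [1, 1, 0, 0, 0, 0, 0]}
```

The nested loop is quadratic in pings and properties; a real dataset would likely need a spatial index, which is omitted here for brevity.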
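The stacked model described in the abstract can be written compactly as follows. The notation is an assumption introduced here for illustration: $f_s$ and $f_d$ denote the random forests trained on the static features $x^{(s)}$ and the dynamic features $x^{(d)}$, and $\lambda$ is the ridge penalty. The abstract does not state the value of $\lambda$ or whether the forest outputs fed to the ridge regression were out-of-fold predictions.

```latex
% Final prediction: a ridge regression over the two random forest outputs
\hat{y}(x) = \beta_0 + \beta_s \, f_s\!\left(x^{(s)}\right) + \beta_d \, f_d\!\left(x^{(d)}\right)

% Ridge objective over n training properties with observed prices y_i
\min_{\beta_0,\,\beta_s,\,\beta_d} \;
  \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_s \, f_s\!\left(x_i^{(s)}\right) - \beta_d \, f_d\!\left(x_i^{(d)}\right) \right)^2
  + \lambda \left( \beta_s^2 + \beta_d^2 \right)
```

Under this reading, the comparison reported in the abstract contrasts the mean squared error of the model with the dynamic mobile location features against the same model without them.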