Pub Date: 2024-05-24 DOI: 10.1016/j.compenvurbsys.2024.102131
Madison Lore, Julia Gabriele Harten, Geoff Boeing
Large-scale text data from public sources, including social media or online platforms, can expand urban planners' ability to monitor and analyze urban conditions in near real-time. To overcome scalability challenges of manual techniques for qualitative data analysis, researchers and practitioners have turned to computer-automated methods, such as natural language processing (NLP) and deep learning. However, the benefits, challenges, and trade-offs of these methods remain poorly understood. How much meaning can different NLP techniques capture and how do their results compare to traditional manual techniques? Drawing on 90,000 online rental listings in Los Angeles County, this study proposes and compares manual, semi-automated, and fully automated methods for identifying context-informed topics in unstructured, user-generated text data. We find that fully automated methods perform best with more-structured text, but struggle to separate topics in free-flow text and when handling nuanced language. Introducing a manual technique first on a small data set to train a semi-automated method, however, improves accuracy even as the structure of the text degrades. We argue that while fully automated NLP methods are attractive replacements for scaling manual techniques, leveraging the contextual understanding of human expertise alongside efficient computer-based methods like BERT models generates better accuracy without sacrificing scalability.
Title: A hybrid deep learning method for identifying topics in large-scale urban text data: Benefits and trade-offs
Journal: Computers Environment and Urban Systems, volume 111, Article 102131
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0198971524000607/pdfft?md5=9c8f877cb67840528ee457f6a117bb9b&pid=1-s2.0-S0198971524000607-main.pdf
Pub Date: 2024-05-22 DOI: 10.1016/j.compenvurbsys.2024.102130
Xiangkai Zhou, Linlin You, Shuqi Zhong, Ming Cai
Mobile phone network data is a vital source for unveiling human mobility characteristics, owing to its large-scale spatiotemporal trajectory information. However, mobile phone network data usually records location at the level of cell towers, lacking accurate individual locations. As a result, the authenticity and credibility of conclusions drawn from such data are often questioned because of this spatial uncertainty. In this paper, we evaluate the location differences between users and cell towers during connection establishment. Furthermore, we examine the representation and contributing factors of spatial uncertainty, including cell tower density, antenna status, and user mobility. Our analysis is based on one month of mobile signaling data and taxi GPS data collected in Foshan (a prefecture-level city in China), which represent two forms of data on the mobility of the same individual. We conclude that to estimate user positions, areas significantly larger than the nearest cell tower are necessary. The influence of tower density and antenna load on connection accuracy does not exhibit a straightforward linear dependency; instead, it fluctuates once a threshold is reached. Connection accuracy is typically higher when users are stationary than when they are in motion. Together, our findings indicate that researchers should carefully assess the accuracy of position estimation when mapping from cell tower location to user location.
Title: From cell tower location to user location: Understanding the spatial uncertainty of mobile phone network data in human mobility research
Journal: Computers Environment and Urban Systems, volume 111, Article 102130
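The location difference between a user and the connected cell tower, as described in the abstract above, can be illustrated with a great-circle distance between a GPS fix and a tower coordinate. The following is a minimal sketch, not the authors' procedure: the function names `haversine_m` and `tower_offsets`, the pairing of one GPS fix with one tower, and the use of the haversine formula are all assumptions made for illustration.

```python
import math

# Mean Earth radius in metres (a standard approximation, not from the paper).
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def tower_offsets(pairs):
    """Distance from each GPS fix to its connected tower.

    `pairs` is a hypothetical list of ((gps_lat, gps_lon), (tower_lat, tower_lon))
    tuples, one per connection event.
    """
    return [haversine_m(g_lat, g_lon, t_lat, t_lon)
            for (g_lat, g_lon), (t_lat, t_lon) in pairs]


if __name__ == "__main__":
    # Made-up coordinates near Foshan, for illustration only.
    events = [((23.0215, 113.1214), (23.0250, 113.1300)),
              ((23.0400, 113.1000), (23.0400, 113.1000))]
    for offset in tower_offsets(events):
        print(f"{offset:.1f} m")
```

Summarising these offsets (for example by tower density band, or by stationary versus moving users) is one way to look at the kinds of factors the abstract lists.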
Pub Date: 2024-05-18 DOI: 10.1016/j.compenvurbsys.2024.102129
Binyu Lei, Pengyuan Liu, Nikola Milojevic-Dupont, Filip Biljecki
Building characteristics, such as number of storeys and type, play a key role across many domains: interpreting urban form, simulating urban microclimate, or modelling building energy. However, geospatial data on the building stock is often fragmented and incomplete. Here, we propose a novel and easily adaptable method to predict building characteristics in diverse cities, which attempts to mitigate such data gaps. Our method exploits the geospatial connectivity between street-level urban objects and building characteristics by employing graph neural networks, as they can model spatial relationships and leverage them for predictions. We apply this approach in three representative cities (Boston, Melbourne, and Helsinki) that offer a variety of building features as prediction targets (storeys, types, construction period and materials) and diverse urban environments as predictors. Overall, the magnitude of errors is acceptable for a series of use cases. In the prediction of building storeys, an average of 81.83% of buildings across the three cities have a prediction error of less than one storey. We also find that the prediction of building type achieves an average accuracy of 88.33% across the three cities. Meanwhile, an average of 70.5% of buildings are correctly classified by construction period in Melbourne and Helsinki, and the building material prediction accuracy is 68% in Helsinki. The results confirm that our approach is adaptable across different urban environments because comparable performance is achieved in the other two cities. Further, we assess the impact of varying local data availability on model performance. Our findings underscore the feasibility of the method in scenarios with sparse building data (10%, 30% and 50% availability). Our graph-based approach advances research on filling in incomplete building semantics from existing datasets, and showcases the potential to enable 3D city modelling.
Given the broad applicability of the approach to predicting many building characteristics, diverse downstream applications exist, such as enhancing contemporary urban studies (e.g. exploring streetscapes) and facilitating the development of 3D GIS (e.g. maintaining and updating 3D building settings).
Title: Predicting building characteristics at urban scale using graph neural networks and street-level context
Journal: Computers Environment and Urban Systems, volume 111, Article 102129
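The storey result in the abstract above is reported as the share of buildings whose prediction error is below one storey. A metric of that kind could be computed as in the sketch below. The function names, the strict `< 1` threshold, and the unweighted mean across cities are assumptions for illustration; the abstract does not state how the cross-city average was formed, and the numbers in the example are made up.

```python
def share_within_one_storey(true_storeys, predicted_storeys):
    """Fraction of buildings whose absolute storey error is below one storey."""
    if len(true_storeys) != len(predicted_storeys) or not true_storeys:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(true_storeys, predicted_storeys)
               if abs(p - t) < 1)
    return hits / len(true_storeys)


def mean_share(per_city_shares):
    """Unweighted mean of per-city shares (an assumed aggregation)."""
    return sum(per_city_shares) / len(per_city_shares)


if __name__ == "__main__":
    # Toy data, not results from the paper.
    city_a = share_within_one_storey([3, 5, 10, 2], [3.4, 6.5, 10.9, 2.0])
    city_b = share_within_one_storey([1, 2], [1.2, 2.1])
    print(city_a, city_b, mean_share([city_a, city_b]))
```

A building-weighted average would give a different figure when cities differ in size, so the aggregation choice should be stated alongside any such metric.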
Pub Date: 2024-05-08 DOI: 10.1016/j.compenvurbsys.2024.102119
Mueller Maya, Hoque Simi, Hamil Pearsall
Gentrification is a complex and context-specific process that involves changes in the built environment and social fabric of neighborhoods, often resulting in the displacement of vulnerable communities. Machine Learning (ML) has emerged as a powerful predictive tool that is capable of circumventing the methodological challenges that historically held back researchers from producing reliable forecasts of gentrification. Additionally, computer vision ML algorithms for landscape character assessment, or deep mapping, can now capture a wider range of built metrics related to gentrification-induced redevelopment. These novel ML applications promise to rapidly advance our understanding of gentrification and our capacity to translate academic findings into more productive direction for communities and stakeholders, but with this sudden development comes a steep learning curve. The current paper aims to bridge this divide by providing an overview of recent progress and an actionable template of use that is accessible to researchers across a wide array of academic fields. As a secondary point of emphasis, the review covers Explainable Artificial Intelligence (XAI) tools for gentrification models and opens up discussion on the nuanced challenges that arise when applying black-box models to human systems.
Title: Machine learning to model gentrification: A synthesis of emerging forms
Journal: Computers Environment and Urban Systems, volume 111, Article 102119
Pub Date: 2024-05-01 DOI: 10.1016/j.compenvurbsys.2024.102122
Yonggai Zhuang, Yuhao Kang, Teng Fei, Meng Bian, Yunyan Du
People experience the world through multiple senses simultaneously, contributing to our sense of place. Prior quantitative geography studies have mostly emphasized human visual perceptions, neglecting auditory perceptions of place because of the challenges in characterizing the acoustic environment vividly. Also, few studies have synthesized the two dimensions of perception (auditory and visual) in understanding human sense of place. To bridge these gaps, we propose a Soundscape-to-Image Diffusion model, a generative Artificial Intelligence (AI) model supported by Large Language Models (LLMs), aiming to visualize soundscapes through the generation of street view images. By creating audio-image pairs, acoustic environments are first represented as high-dimensional semantic audio vectors. Our proposed Soundscape-to-Image Diffusion model, which contains a Low-Resolution Diffusion Model and a Super-Resolution Diffusion Model, can then translate those semantic audio vectors into visual representations of place effectively. We evaluated our proposed model by using both machine-based and human-centered approaches. We show that the generated street view images align with our common perceptions and accurately recreate several key street elements of the original soundscapes. The results also demonstrate that soundscapes provide sufficient visual information about places. This study stands at the forefront of the intersection between generative AI and human geography, demonstrating how human multi-sensory experiences can be linked. We aim to enrich geospatial data science and AI studies with human experiences. The work has the potential to inform multiple domains such as human geography, environmental psychology, and urban design and planning, as well as to advance our knowledge of human-environment relationships.
Title: From hearing to seeing: Linking auditory and visual place perceptions with soundscape-to-image generative artificial intelligence
Journal: Computers Environment and Urban Systems, volume 110, Article 102122
Pub Date: 2024-04-29 DOI: 10.1016/j.compenvurbsys.2024.102120
Guan Huang, Zhan Zhao, A.G.O. Yeh
As an emerging sustainable mobility solution, ridesplitting services match passengers traveling in a similar direction with a single vehicle to reduce fleet size, vehicle kilometers traveled, and traffic emissions. However, these benefits can only be achieved with successful matching (sharing) between passengers, which emphasizes the importance of a comprehensive understanding of the matching success rate, i.e., shareability. Despite extensive research into the determinants of shareability, existing literature either relies on simulations and theoretical models with limited empirical validation, or focuses on system-level shareability for the whole market, overlooking the significant spatiotemporal variability of shareability across trips. This study aims to fill these gaps by proposing a path-based model that leverages real-world ridesplitting data to quantify the determinants of shareability at a finer spatiotemporal granularity. Utilizing data from New York City, our results show that: (1) shareability is spatiotemporally heterogeneous; (2) high demand intensity, especially the intensity of medium-/short-distance trips, contributes to greater shareability; (3) the positive contribution of demand intensity diminishes as it increases; (4) a higher road speed improves shareability; (5) an excess of one-way streets and over-dense street networks are related to low shareability. These results validate and enrich prior findings and can be used to inform the future development of ridesplitting services.
Title: How shareable is your trip? A path-based analysis of ridesplitting trip shareability
Journal: Computers Environment and Urban Systems, volume 110, Article 102120
Pub Date: 2024-04-26 DOI: 10.1016/j.compenvurbsys.2024.102121
Cillian Berragan, Alex Singleton, Alessia Calafiore, Jeremy Morley
Observed regional variation in geotagged social media text is often attributed to dialects, where features in language are assumed to exhibit region-specific properties. While dialects are seen as a key component in defining the identity of regions, there are a multitude of other geographic properties that may be captured within natural language text. In our work, we consider locational mentions that are directly embedded within comments on the social media website Reddit, providing a range of associated semantic information, and enabling deeper representations between locations to be captured. Using a large corpus of geoparsed Reddit comments from UK-related local discussion subreddits, we first extract embedded semantic information using a large language model, aggregated into local authority districts, representing the semantic footprint of these regions. These footprints broadly exhibit spatial autocorrelation, with clusters that conform with the national borders of Wales and Scotland. London, Wales, and Scotland also demonstrate notably different semantic footprints compared with the rest of Great Britain.
Title: Mapping Great Britain's semantic footprints through a large language model analysis of Reddit comments. Computers Environment and Urban Systems, volume 110, Article 102121. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0198971524000504/pdfft?md5=ea3c1ade10d7db227e51de2d2551f34b&pid=1-s2.0-S0198971524000504-main.pdf
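The semantic-footprint idea in the abstract above can be illustrated with a minimal sketch. It assumes that each geoparsed comment already has an embedding vector and a district label, and that a district's footprint is the mean of its comment embeddings; the abstract does not say how the aggregation is done, so the mean is an assumption. The function names and toy vectors are hypothetical and do not come from a real language model.

```python
from collections import defaultdict
from math import sqrt


def semantic_footprints(comments):
    """Average embedding vectors per district.

    `comments` is an iterable of (district, embedding) pairs. Averaging is
    an assumed aggregation, not necessarily the one used in the paper.
    """
    sums = {}
    counts = defaultdict(int)
    for district, vec in comments:
        if district not in sums:
            sums[district] = [0.0] * len(vec)
        sums[district] = [s + v for s, v in zip(sums[district], vec)]
        counts[district] += 1
    return {d: [s / counts[d] for s in total] for d, total in sums.items()}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy two-dimensional embeddings, made up for illustration only.
toy_comments = [
    ("Cardiff", [1.0, 0.0]),
    ("Cardiff", [0.8, 0.2]),
    ("Swansea", [0.9, 0.1]),
    ("Camden", [0.0, 1.0]),
]
footprints = semantic_footprints(toy_comments)
```

Comparing footprints with `cosine_similarity` gives a simple measure of how alike two districts are; in this toy data the two Welsh districts score closer to each other than either does to the London one.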
Pub Date : 2024-04-25 DOI: 10.1016/j.compenvurbsys.2024.102114
Jianxin Qin, Lu Wang, Tao Wu, Ye Li, Longgang Xiang, Yuanyuan Zhu
The growing ubiquity of location/activity sensing technologies has created unprecedented opportunities for research on human spatiotemporal interaction behavior in mobile environments. However, existing studies of human mobility have yet to sufficiently account for the association of indoor scenes with the semantics of human behavior. This paper introduces TSTM-in, a trajectory model that combines trajectory data and indoor scenes using topological semantic modeling, semantic trajectory reconstruction, and trajectory queries. The model manages indoor semantic trajectory data and extracts topological behavioral semantics by incorporating important points across a trajectory to reflect the semantics of key points connected to indoor corridors and regions. These topological semantics enable a flexible, intersection-based reconstruction of indoor semantic trajectories. Reconstructed semantic trajectories represent human mobility by integrating semantic data sets along the time axis. A case study with real-world trajectory queries from travelers demonstrates the model's effectiveness. TSTM-in associates indoor scenes with the semantics of human behavior, supporting the construction of mobile object management applications for indoor scenes and providing sound spatiotemporal semantic descriptions for location-based services in smart cities.
Title: Indoor mobility data encoding with TSTM-in: A topological-semantic trajectory model. Computers Environment and Urban Systems, volume 110, Article 102114.
Pub Date : 2024-04-25 DOI: 10.1016/j.compenvurbsys.2024.102116
Elijah Knaap, Sergio Rey
In this paper we examine the evolution of urban spatial structure in U.S. metropolitan areas over nearly two decades. Using annual block-level data from the Longitudinal Employer-Household Dynamics database, we introduce a technique for identifying regional employment centers that both adheres to urban economic theory and pays homage to classic contributions in local spatial statistics. Centers are defined as local spatial statistical outliers on the network-based job accessibility surface. We proceed by identifying the location and employment makeup of centers for each metropolitan region in the USA from 2002 to 2019 and discuss emergent trends across time and space. Critically, we not only explore empirical patterns but also discuss the relationship between polycentricity, the evolution of urbanization and localization economies, and regional specialization. We confirm again the pattern of polycentricity in U.S. metros and show that the structure of metropolitan employment is largely stable over time. We also document a continuing trend away from urbanization economies into more specialized subcenters.
Title: Measuring two decades of urban spatial structure: The evolution of agglomeration economies in American metros. Computers Environment and Urban Systems, volume 110, Article 102116. Open-access PDF: https://www.sciencedirect.com/science/article/pii/S0198971524000450/pdfft?md5=9fd6287b175eb18342b8ee2c1892ab5d&pid=1-s2.0-S0198971524000450-main.pdf
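The abstract above defines employment centers as local spatial statistical outliers on a job accessibility surface. As a hedged illustration, the sketch below computes local Moran's I, one classic local spatial statistic, on a toy one-dimensional strip of blocks. The abstract does not name the statistic the authors use, so this choice, the neighbor structure, and the accessibility values are all assumptions. A real analysis would also test significance, for example by permutation, which is omitted here.

```python
from statistics import mean, pstdev


def local_moran(values, neighbors):
    """Local Moran's I with row-standardized binary weights.

    values: accessibility scores, one per spatial unit.
    neighbors: dict mapping unit index -> list of neighboring indices.
    Returns the standardized scores and the local statistic per unit.
    """
    mu = mean(values)
    sigma = pstdev(values)
    z = [(v - mu) / sigma for v in values]
    local_i = []
    for i, zi in enumerate(z):
        nbrs = neighbors[i]
        lag = sum(z[j] for j in nbrs) / len(nbrs)  # spatial lag of z
        local_i.append(zi * lag)
    return z, local_i


# Toy strip of seven blocks; the accessibility values are made up.
access = [1.0, 1.0, 1.0, 9.0, 8.0, 1.0, 1.0]
nbrs = {
    i: [j for j in (i - 1, i + 1) if 0 <= j < len(access)]
    for i in range(len(access))
}
z, local_i = local_moran(access, nbrs)

# Candidate centers: units above average whose neighborhood is also above average.
candidates = [
    i for i, (zi, li) in enumerate(zip(z, local_i)) if zi > 0 and li > 0
]
```

In this toy strip only the two adjacent high-accessibility blocks qualify as candidates, which mirrors the idea of a center as a cluster of unusually accessible locations rather than a single high value.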
Pub Date : 2024-04-22 DOI: 10.1016/j.compenvurbsys.2024.102118
Cassiano Bastos Moroz, Tobias Sieg, Annegret H. Thieken
Spatial constraints are fundamental to integrating the spatial suitability for urbanization into Cellular Automata (CA)-based urban growth models, but there is a lack of consensus on the optimal methods for this purpose. This study compared the performance of three probabilistic classifiers to generate suitability surfaces for CA-based urban growth models: Logistic Regression using Generalized Linear Model (LR-GLM), Logistic Regression using Generalized Additive Model (LR-GAM), and Random Forest (RF). The study also evaluated the sensitivity of these classifiers to the input urban map adopted as the dependent variable. For this analysis, seven maps were tested: the historical urban map containing the entire extent of the urban footprint, and six additional maps containing only the recently urbanized areas over timeframes ranging from one year up to two decades. The comparison evaluated the goodness of fit of the suitability surfaces and the spatial accuracy of the urban growth simulations, using five large Brazilian cities as case study areas. The results revealed that the RF classifier significantly outperformed the LR-based classifiers. However, this advantage was more prominent when the input urban maps contained the new urban cells from the last one to two decades of growth. In addition, the sensitivity analysis of the input urban maps emphasized the benefits of calibrating the classifier using the recently urbanized cells rather than the historical urban extent. We consistently observed these results concerning classifiers and input urban maps across all five case study areas. Thus, the RF classifier combined with a training dataset containing the newly urbanized areas over at least the last 10 years systematically resulted in the suitability surfaces with the highest predictability among all tested scenarios.
Title: Spatial constraints in cellular automata-based urban growth models: A systematic comparison of classifiers and input urban maps. Computers Environment and Urban Systems, volume 110, Article 102118.
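The abstract above contrasts two ways of building the dependent variable for a suitability classifier: the full historical urban extent versus only the recently urbanized cells. The sketch below shows one way that choice could be encoded. It assumes two binary urban maps flattened to lists, and it assumes that cells already urban at the earlier date are dropped from the "recent" training set; the abstract does not state how such cells are handled, so that rule and the function name are hypothetical.

```python
def training_labels(urban_t0, urban_t1, recent_only=True):
    """Build (cell_index, label) pairs for a suitability classifier.

    urban_t0, urban_t1: 0/1 flags per cell at an earlier and a later date.
    recent_only=True keeps only cells that were non-urban at t0, so the
    positives are newly urbanized cells (an assumed handling rule).
    recent_only=False labels every cell by its urban status at t1.
    """
    samples = []
    for i, (before, after) in enumerate(zip(urban_t0, urban_t1)):
        if recent_only and before == 1:
            continue  # already urban at t0: excluded under this assumption
        samples.append((i, after))
    return samples


# Toy six-cell maps, made up for illustration.
urban_2000 = [1, 1, 0, 0, 0, 0]
urban_2020 = [1, 1, 1, 1, 0, 0]
recent = training_labels(urban_2000, urban_2020, recent_only=True)
historical = training_labels(urban_2000, urban_2020, recent_only=False)
```

Either list of labeled cells could then be paired with predictor variables and passed to a probabilistic classifier; only the set of cells counted as positive examples differs between the two settings.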