Ortis Yankey, Chigozie E. Utazi, Christopher C. Nnanatu, Assane N. Gadiaga, Thomas Abbot, Attila N. Lazar, Andrew J. Tatem
Applied Geography, Volume 172, Article 103416. Published 2024-09-14. DOI: 10.1016/j.apgeog.2024.103416. https://www.sciencedirect.com/science/article/pii/S0143622824002212
Disaggregating census data for population mapping using a Bayesian Additive Regression Tree model
Population data are crucial for policy decisions, but fine-scale population counts are often unavailable because sensitive data are difficult to share. Several approaches, including the Random Forest (RF) model, have been used to disaggregate census data from higher administrative units to small-area scales. A major limitation of the RF model is that it cannot quantify the uncertainty associated with the predicted populations, which can be important for policy decisions. In this study, we applied a Bayesian Additive Regression Tree (BART) model to population disaggregation and compared its results with those of an RF model using both simulated data and the 2021 census data for Ghana. The BART model consistently outperformed the RF model in out-of-sample prediction on all metrics considered, including bias, mean squared error (MSE), and root mean squared error (RMSE). The BART model also addresses this limitation of the RF model by providing uncertainty estimates around the predicted populations. Overall, the study demonstrates the superiority of the BART model over the RF model in disaggregating population data and highlights its potential for gridded population estimates.
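The abstract names bias, MSE, and RMSE as comparison metrics and highlights uncertainty estimates around predictions. The following is a minimal Python sketch of how such quantities are commonly computed; it is not the authors' code. The sign convention for bias (predicted minus observed), the nearest-rank interval rule, the function names, and the sample numbers are all assumptions made for illustration.

```python
import math


def bias(observed, predicted):
    """Mean of (predicted - observed); positive means over-prediction.

    Assumption: the paper may define bias with the opposite sign.
    """
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean squared error between observed and predicted counts."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the counts."""
    return math.sqrt(mse(observed, predicted))


def credible_interval(draws, level=0.95):
    """Equal-tailed interval from posterior draws (nearest-rank quantiles).

    Assumption: `draws` stands in for posterior predictive samples of one
    unit's population, such as a Bayesian model like BART could supply.
    """
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[int(math.floor(tail * last))]
    upper = ordered[int(math.ceil((1.0 - tail) * last))]
    return lower, upper


# Hypothetical held-out population counts and model predictions.
observed = [120, 80, 200, 50]
predicted = [110, 90, 210, 40]

print(bias(observed, predicted))
print(mse(observed, predicted))
print(rmse(observed, predicted))

# Hypothetical posterior draws for a single unit.
print(credible_interval(list(range(101)), level=0.5))
```

A point-prediction model such as RF yields only the first three quantities; the interval function shows the kind of additional summary that posterior draws make possible.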
About the journal:
Applied Geography is a journal devoted to the publication of research that uses geographic approaches (human, physical, nature-society, and GIScience) to resolve human problems that have a spatial dimension. These problems may be related to the assessment, management, and allocation of the world's physical and/or human resources. The underlying rationale of the journal is that only through a clear understanding of the relevant societal, physical, and coupled natural-human systems can we resolve such problems. Papers are invited on any theme involving the application of geographical theory and methodology in the resolution of human problems.