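Predicting population-level vulnerability among pregnant women using routinely collected data and the added relevance of self-reported data.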

European Journal of Public Health | IF 3.7 | Zone 3 (Medicine) | JCR Q1: Public, Environmental & Occupational Health | Pub Date: 2024-12-01 | DOI: 10.1093/eurpub/ckae184

Joyce M Molenaar, Ka Yin Leung, Lindsey van der Meer, Peter Paul F Klein, Jeroen N Struijs, Jessica C Kiefte-de Jong

Pages: 1210-1217 | Publication type: Journal Article | Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631480/pdf/ | Metadata source: Semantic Scholar

Citations: 0

Abstract




Recognizing and addressing vulnerability during the first thousand days of life can prevent health inequities. It is necessary to determine the best data for predicting multidimensional vulnerability (i.e. risk factors to vulnerability across different domains and a lack of protective factors) at population level to understand national prevalence and trends. This study aimed to (1) assess the feasibility of predicting multidimensional vulnerability during pregnancy using routinely collected data, (2) explore potential improvement of these predictions by adding self-reported data on health, well-being, and lifestyle, and (3) identify the most relevant predictors. The study was conducted using Dutch nationwide routinely collected data and self-reported Public Health Monitor data. First, to predict multidimensional vulnerability using routinely collected data, we used random forest (RF) and considered the area under the curve (AUC) and F1 measure to assess RF model performance. To validate results, sensitivity analyses (XGBoost and Lasso) were done. Second, we gradually added self-reported data to predictions. Third, we explored the RF model's variable importance. The initial RF model could distinguish between those with and without multidimensional vulnerability (AUC = 0.98). The model was able to correctly predict multidimensional vulnerability in most cases, but there was also misclassification (F1 measure = 0.70). Adding self-reported data improved RF model performance (e.g. F1 measure = 0.80 after adding perceived health). The strongest predictors concerned self-reported health, socioeconomic characteristics, and healthcare expenditures and utilization. It seems possible to predict multidimensional vulnerability using routinely collected data that is readily available. However, adding self-reported data can improve predictions.
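The abstract reports a high AUC (0.98) alongside a more modest F1 measure (0.70). The two metrics answer different questions: AUC measures how well predicted scores rank cases above non-cases, while F1 depends on the class labels assigned at a specific cut-off. The following is a minimal sketch of how both can be computed from labels and scores. The toy values, the 0.5 threshold, and the function names `auc_from_scores` and `f1_at_threshold` are assumptions made for illustration; they are not the study's data or the authors' code.

```python
from typing import Sequence


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at_threshold(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) after labelling scores >= threshold as positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented toy data: 2 positive cases among 10 (not from the study).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.7, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]

auc_value = auc_from_scores(labels, scores)  # 15 of 16 pairs ranked correctly
f1_value = f1_at_threshold(labels, scores)   # TP=2, FP=1, FN=0 at threshold 0.5

print(f"AUC = {auc_value:.4f}, F1 = {f1_value:.2f}")  # AUC = 0.9375, F1 = 0.80
```

In this toy case a single false positive among ten people leaves the ranking almost perfect but lowers F1 noticeably, because positives are rare. This is one plausible reading of how good discrimination and some misclassification can coexist, as the abstract describes; the sketch does not reproduce the study's analysis.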


Source journal

European Journal of Public Health (Medicine: Public, Environmental & Occupational Health)

CiteScore: 5.60
Self-citation rate: 2.30%
Articles published: 2039
Review time: 3-8 weeks

About the journal: The European Journal of Public Health (EJPH) is a multidisciplinary journal aimed at attracting contributions from epidemiology, health services research, health economics, social sciences, management sciences, ethics and law, environmental health sciences, and other disciplines of relevance to public health. The journal provides a forum for discussion and debate of current international public health issues, with a focus on the European Region. Bi-monthly issues contain peer-reviewed original articles, editorials, commentaries, book reviews, news, letters to the editor, announcements of events, and various other features.