Improving Sampling Probability Definitions with Predictive Algorithms

IF 1.1 · Zone 3 (Sociology) · Q2 Anthropology · Field Methods · Pub Date: 2022-09-15 · DOI: 10.1177/1525822X221113181
Matthew Jannetti, A. Carroll-Scott, Erikka Gilliam, Irene E. Headen, Maggie Beverly, F. Lê-Scherban
Citations: 1

Abstract

Place-based initiatives often use resident surveys to inform and evaluate interventions. Sampling based on well-defined sampling frames is important but challenging for initiatives that target subpopulations. Databases that enumerate total population counts can produce overinclusive sampling frames, resulting in costly outreach to ineligible participants. Quantifying eligibility before sampling using machine learning algorithms can improve efficiency and reduce costs. We developed a model to improve sampling for the West Philly Promise Neighborhood’s biennial population-representative survey of households with children within a geographic footprint. This study proposes a method to estimate probability of study eligibility by building a well-calibrated predictive model using existing administrative data sources. Six machine-learning models were evaluated; logistic regression provided the best balance of accuracy and understandable probabilities. This approach can be a blueprint for other population-based studies whose sampling frames cannot be well defined using traditional sources.
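
The abstract's central idea, estimating each address's probability of eligibility with a calibrated model before sampling, can be illustrated with a small sketch. The abstract does not specify the predictors, the data, or the calibration metrics used, so everything below is an assumption for illustration only: the two binary features (`in_school_records`, `large_unit`), the synthetic data generator, and the use of the Brier score with a binned calibration table as the calibration check. It fits only a logistic regression (the model the authors report as the best balance) and is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """Predicted eligibility probability; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=400):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def brier_score(probs, y):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for p, t in zip(probs, y)) / len(y)


def calibration_table(probs, y, n_bins=5):
    """(mean predicted, observed rate, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, t in zip(probs, y):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, t))
    return [
        (sum(p for p, _ in b) / len(b), sum(t for _, t in b) / len(b), len(b))
        for b in bins
        if b
    ]


# Hypothetical synthetic frame: each address has two made-up binary
# predictors and a true eligibility flag (household with children).
rng = random.Random(0)


def make_address():
    in_school_records = float(rng.random() < 0.3)
    large_unit = float(rng.random() < 0.5)
    p_true = sigmoid(-2.0 + 2.5 * in_school_records + 1.0 * large_unit)
    return [in_school_records, large_unit], int(rng.random() < p_true)


data = [make_address() for _ in range(2000)]
X = [x for x, _ in data]
y = [t for _, t in data]
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

weights = fit_logistic(X_train, y_train)
probs = [predict_proba(weights, x) for x in X_test]

# Compare against a baseline that assigns every address the training base rate.
base_rate = sum(y_train) / len(y_train)
brier_model = brier_score(probs, y_test)
brier_baseline = brier_score([base_rate] * len(y_test), y_test)

# Summing the probabilities gives the expected number of eligible addresses.
expected_eligible = sum(probs)

print("Brier (model):   ", round(brier_model, 3))
print("Brier (baseline):", round(brier_baseline, 3))
print("Expected vs. actual eligible:", round(expected_eligible, 1), sum(y_test))
for mean_pred, observed, count in calibration_table(probs, y_test):
    print(f"predicted={mean_pred:.2f} observed={observed:.2f} n={count}")
```

If the predicted probabilities are well calibrated, the predicted and observed rates in each bin should be close, and the sum of the probabilities should approximate the number of eligible addresses. That is what makes such probabilities usable for planning outreach and adjusting sampling, rather than only for ranking addresses.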
Source journal
Field Methods
CiteScore: 2.70
Self-citation rate: 5.90%
Articles published: 41
About the journal: Field Methods (formerly Cultural Anthropology Methods) is devoted to articles about the methods used by field workers in the social and behavioral sciences and humanities for the collection, management, and analysis of data about human thought and/or human behavior in the natural world. Articles should focus on innovations and issues in the methods used, rather than on the reporting of research or theoretical/epistemological questions about research. High-quality articles using qualitative and quantitative methods, from scientific or interpretative traditions, dealing with data collection and analysis in applied and scholarly research from writers in the social sciences, humanities, and related professions are all welcome in the pages of the journal.