Sampling from networks: respondent-driven sampling

Epidemiologic Methods (JCR Q3, Mathematics). Pub Date: 2020-02-13. DOI: 10.1515/em-2020-0033
Mamadou Yauck, E. Moodie, Herak Apelian, Marc-Messier Peet, G. Lambert, D. Grace, N. Lachowsky, T. Hart, J. Cox
Citations: 3

Abstract

Objectives: Respondent-Driven Sampling (RDS) is a variant of link-tracing, a sampling technique for surveying hard-to-reach communities that takes advantage of community members' social networks to reach potential participants. While the RDS sampling mechanism and associated methods of adjusting for the sampling at the analysis stage are well-documented in the statistical sciences literature, methodological focus has largely been restricted to estimation of population means and proportions, while giving little to no consideration to the estimation of population network parameters. As a network-based sampling method, RDS is faced with the fundamental problem of sampling from population networks where features such as homophily (the tendency for individuals with similar traits to share social ties) and differential activity (the ratio of the average number of connections by attribute) are sensitive to the choice of a sampling method.

Methods: Many simple approaches exist to generate simulated RDS data, with specific levels of network features (mainly homophily and differential activity), where the focus is on estimating means and proportions (Gile 2011; Gile et al. 2015; Spiller et al. 2018). However, recent findings on the inconsistency of estimators of network features such as homophily in partially observed networks (Crawford et al. 2017; Shalizi and Rinaldo 2013) raise the question of whether those target features can be recovered using the observed RDS data alone, since recovering information about these features is critical if we wish to condition upon them. In this paper, we conduct a simulation study to assess the accuracy of existing RDS simulation methods, in terms of their abilities to generate RDS samples with the desired levels of two network parameters: homophily and differential activity.

Results: The results show that (1) homophily cannot be consistently estimated from simulated RDS samples and (2) differential activity estimators are more precise when groups, defined by traits, are equally active and equally represented in the population. We use this approach to mimic features of the Engage Study, an RDS sample of gay, bisexual and other men who have sex with men in Montréal, Canada.

Conclusions: In this paper, we highlight that it is possible, in some cases, to simulate population networks by mimicking the characteristics of real-world RDS data while retaining accuracy and precision for target network features in the samples.
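
To make the two network features in the abstract concrete, the following is a minimal Python sketch, not the authors' simulation method. It assumes a binary trait, a simple random graph in which same-trait pairs are tied with a higher probability, and a coupon-limited link-tracing process without replacement. The function names, the parameter values, and the use of the share of same-trait ties as a homophily proxy are all illustrative assumptions. Differential activity follows the abstract's definition: the ratio of average degree between the two trait groups.

```python
import random
from collections import deque


def make_network(n, frac_1, p_same, p_diff, rng):
    """Random undirected graph with a binary trait (illustrative generator).

    Same-trait pairs are tied with probability p_same, mixed pairs with p_diff.
    """
    trait = [1 if i < round(frac_1 * n) else 0 for i in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < (p_same if trait[i] == trait[j] else p_diff):
                adj[i].add(j)
                adj[j].add(i)
    return trait, adj


def differential_activity(trait, adj, nodes):
    """Mean degree of trait-1 nodes divided by mean degree of trait-0 nodes."""
    deg1 = [len(adj[v]) for v in nodes if trait[v] == 1]
    deg0 = [len(adj[v]) for v in nodes if trait[v] == 0]
    if not deg1 or not deg0 or sum(deg0) == 0:
        return float("nan")  # undefined when a group is empty or has no ties
    return (sum(deg1) / len(deg1)) / (sum(deg0) / len(deg0))


def same_trait_share(trait, ties):
    """Share of ties joining two same-trait nodes (assumed homophily proxy)."""
    ties = list(ties)
    if not ties:
        return float("nan")
    return sum(trait[u] == trait[v] for u, v in ties) / len(ties)


def rds_sample(adj, seeds, coupons, target, rng):
    """Link-tracing without replacement: each recruit passes on up to
    `coupons` coupons to not-yet-sampled contacts, in recruitment order.

    Returns the sampled nodes and the observed (recruiter, recruit) ties.
    """
    sampled, seen = list(seeds), set(seeds)
    recruitments = []
    queue = deque(seeds)
    while queue and len(sampled) < target:
        recruiter = queue.popleft()
        candidates = sorted(v for v in adj[recruiter] if v not in seen)
        rng.shuffle(candidates)
        for v in candidates[:coupons]:
            if len(sampled) >= target:
                break
            seen.add(v)
            sampled.append(v)
            recruitments.append((recruiter, v))
            queue.append(v)
    return sampled, recruitments


if __name__ == "__main__":
    rng = random.Random(2020)
    trait, adj = make_network(n=300, frac_1=0.5, p_same=0.06, p_diff=0.02, rng=rng)
    all_ties = [(u, v) for u in adj for v in adj[u] if u < v]
    sample, recruitments = rds_sample(adj, seeds=[0, 299], coupons=2, target=100, rng=rng)
    print("population homophily proxy:", same_trait_share(trait, all_ties))
    print("sample homophily proxy:    ", same_trait_share(trait, recruitments))
    print("population diff. activity: ", differential_activity(trait, adj, range(300)))
    print("sample diff. activity:     ", differential_activity(trait, adj, sample))
```

In this sketch the sample-level homophily proxy uses only recruitment ties, because those are the only ties an RDS study observes directly, whereas the population value uses every tie. Comparing the two printed pairs across many seeds is one simple way to explore how far sample-based quantities can drift from the values built into the simulated network.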
Source journal: Epidemiologic Methods (Mathematics, Applied Mathematics)
CiteScore: 2.10
Self-citation rate: 0.00%
Articles published: 7
About the journal: Epidemiologic Methods (EM) seeks contributions comparable to those of the leading epidemiologic journals, but also invites papers that may be more technical or of greater length than what has traditionally been allowed by journals in epidemiology. Applications and examples with real data to illustrate methodology are strongly encouraged but not required. Topics: genetic epidemiology, infectious disease, pharmaco-epidemiology, ecologic studies, environmental exposures, screening, surveillance, social networks, comparative effectiveness, statistical modeling, causal inference, measurement error, study design, meta-analysis.