Investigating the ecological fallacy through sampling distributions constructed from finite populations

IF 0.8 Q3 STATISTICS & PROBABILITY Monte Carlo Methods and Applications Pub Date : 2024-08-08 DOI:10.1515/mcma-2024-2013
David J. Torres, Damain Rouson
Citation count: 0

Abstract

Correlation coefficients and linear regression values computed from group averages can differ from those computed from individual scores. This observation, known as the ecological fallacy, often assumes that all individual scores in a population are available. In many situations, however, one must work with a sample from the larger population. In such cases, the computed correlation coefficient and linear regression values depend on the sample chosen and on the underlying sampling distribution. For normally distributed variables and random samples drawn from infinitely large continuous distributions, the sampling distribution of correlation coefficients and linear regression values for group averages is identical to the sampling distribution for individuals. In practice, however, data are often acquired by sampling without replacement from a finite population. Our objective is to demonstrate through Monte Carlo simulations that the sampling distributions for correlation and linear regression are also similar for individuals and group averages when sampling without replacement from normally distributed variables. These simulations suggest that, when a random sample is selected from a population, correlation coefficients and linear regression values computed from individual scores are no more accurate in estimating the population values than those computed from group averages, as long as the sample size is the same.
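The comparison described in the abstract can be illustrated with a small Monte Carlo experiment. The sketch below is not the authors' code. It assumes a finite population of bivariate normal scores, groups formed by random assignment, and equal numbers of individuals and group averages per sample. The function name `simulate` and all parameter values (`pop_size`, `rho`, `n`, `group_size`, `trials`) are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate(pop_size=10_000, rho=0.5, n=30, group_size=5, trials=2_000, seed=0):
    """Compare sampling distributions of Pearson r for individuals and group averages.

    All defaults are illustrative assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Fixed finite population of (x, y) scores drawn once from a bivariate normal.
    cov = [[1.0, rho], [rho, 1.0]]
    population = rng.multivariate_normal([0.0, 0.0], cov, size=pop_size)
    pop_r = np.corrcoef(population[:, 0], population[:, 1])[0, 1]

    r_individuals = np.empty(trials)
    r_groups = np.empty(trials)
    for t in range(trials):
        # Sample of n individuals, drawn without replacement.
        idx = rng.choice(pop_size, size=n, replace=False)
        sample = population[idx]
        r_individuals[t] = np.corrcoef(sample[:, 0], sample[:, 1])[0, 1]

        # Sample of n groups: draw n * group_size individuals without
        # replacement, split them into n random groups, and average each group.
        idx = rng.choice(pop_size, size=n * group_size, replace=False)
        group_means = population[idx].reshape(n, group_size, 2).mean(axis=1)
        r_groups[t] = np.corrcoef(group_means[:, 0], group_means[:, 1])[0, 1]

    return pop_r, r_individuals, r_groups


if __name__ == "__main__":
    pop_r, r_ind, r_grp = simulate()
    print(f"population r:          {pop_r:.3f}")
    print(f"individuals: mean r = {r_ind.mean():.3f}, sd = {r_ind.std(ddof=1):.3f}")
    print(f"group means: mean r = {r_grp.mean():.3f}, sd = {r_grp.std(ddof=1):.3f}")
```

Under these assumptions, the two sets of simulated correlation coefficients should have similar centers and spreads, which is the kind of agreement the abstract describes. Groups that are not formed at random (for example, groups defined by geography) are outside the scope of this sketch.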
Source journal: Monte Carlo Methods and Applications (Statistics & Probability)
CiteScore: 1.20
Self-citation rate: 22.20%
Articles published: 31