Treatment of sample under-representation and skewed heavy-tailed distributions in survey-based microsimulation: An analysis of redistribution effects in compulsory health care insurance in Switzerland

Tobias Schoch, André Müller
AStA Wirtschafts- und Sozialstatistisches Archiv 14(3-4), 267-304. Published 2020-09-01. DOI: 10.1007/s11943-020-00275-8. https://link.springer.com/article/10.1007/s11943-020-00275-8
Citations: 1

Abstract

The credibility of microsimulation modeling with the research community and policymakers depends on high-quality baseline surveys. Quality problems with the baseline survey tend to impair the quality of microsimulation built on top of the survey data. We address two potential issues that both relate to skewed and heavy-tailed distributions.

First, we find that ultra-high-income households are under-represented in the baseline household survey. Moreover, the sample estimate of average income underestimates the known population average. Although the Deville–Särndal calibration method corrects the under-representation, it cannot achieve alignment of estimated average income in the right tail of the distribution with known population values without distorting the empirical income distribution. To overcome the problem, we introduce a Pareto tail model. With the help of the tail model, we can adjust the sample income distribution in the tail to meet the alignment targets. Our method can be a useful tool for microsimulation modelers working with survey income data.
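
To make the idea of a Pareto tail model concrete, the following is a minimal sketch, not the authors' procedure. It assumes incomes above a chosen threshold follow a Pareto distribution, estimates the shape parameter by weighted maximum likelihood, and shows which shape parameter would reproduce a known tail mean. The function names, the threshold, and the numbers in the usage part are hypothetical and chosen only for illustration.

```python
import math


def pareto_alpha_mle(incomes, weights, threshold):
    """Weighted ML estimate of the Pareto shape parameter alpha,
    using only observations strictly above `threshold`."""
    weight_sum = 0.0
    log_sum = 0.0
    for x, w in zip(incomes, weights):
        if x > threshold:
            weight_sum += w
            log_sum += w * math.log(x / threshold)
    if log_sum == 0.0:
        raise ValueError("no observations above the threshold")
    return weight_sum / log_sum


def pareto_tail_mean(alpha, threshold):
    """Mean of a Pareto distribution with scale `threshold` (finite only if alpha > 1)."""
    if alpha <= 1.0:
        raise ValueError("the Pareto mean is infinite for alpha <= 1")
    return alpha * threshold / (alpha - 1.0)


def alpha_for_target_mean(target_mean, threshold):
    """Shape parameter whose Pareto tail mean equals `target_mean`."""
    if target_mean <= threshold:
        raise ValueError("the target mean must exceed the threshold")
    return target_mean / (target_mean - threshold)


if __name__ == "__main__":
    # Illustrative (made-up) tail incomes and sampling weights.
    incomes = [120_000.0, 150_000.0, 210_000.0, 400_000.0]
    weights = [800.0, 650.0, 500.0, 300.0]
    threshold = 100_000.0

    alpha_hat = pareto_alpha_mle(incomes, weights, threshold)
    print("estimated alpha:", alpha_hat)
    print("implied tail mean:", pareto_tail_mean(alpha_hat, threshold))
    # Shape parameter that would align the tail mean with a known target.
    print("alpha matching target:", alpha_for_target_mean(300_000.0, threshold))
```

A smaller shape parameter means a heavier tail and a larger tail mean, so comparing the estimated value with the value implied by the population target indicates how much the sample tail would have to be adjusted.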

The second contribution refers to the treatment of an outlier-prone variable that has been added to the survey by record linkage (our empirical example is health care cost). The nature of the baseline survey is not affected by record linkage, that is, the baseline survey still covers only a small part of the population. Hence, the sampling weights are relatively large. An outlying observation together with a high sampling weight can heavily influence or even ruin an estimate of a population characteristic. Thus, we argue that it is beneficial—in terms of mean square error—to use robust estimation and alignment methods, because robust methods are less affected by the presence of outliers.
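
The effect of a single outlier that carries a large sampling weight can be illustrated with a small sketch. It uses one-sided winsorization at a weighted quantile as a simple stand-in for a robust estimator; the abstract does not specify the robust methods used, so this is an assumption for illustration only, and all names and numbers are hypothetical.

```python
def weighted_mean(values, weights):
    """Ordinary weighted (Hajek-type) mean."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_quantile(values, weights, p):
    """Smallest value whose cumulative weight share reaches p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    pairs = sorted(zip(values, weights))
    target = p * sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]


def winsorized_weighted_mean(values, weights, p):
    """Weighted mean after capping values at the weighted p-quantile."""
    cap = weighted_quantile(values, weights, p)
    capped = [min(v, cap) for v in values]
    return weighted_mean(capped, weights)


if __name__ == "__main__":
    # Made-up health care costs; the last record is an outlier with a high weight.
    costs = [1.0, 2.0, 3.0, 4.0, 1000.0]
    weights = [20.0, 20.0, 20.0, 20.0, 40.0]
    print("weighted mean:", weighted_mean(costs, weights))
    print("winsorized weighted mean:", winsorized_weighted_mean(costs, weights, 0.6))
```

Capping introduces bias but can reduce variance considerably, which is the mean-square-error trade-off the abstract refers to.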
