Microsimulation of an educational attainment register to study record linkage quality.

International Journal of Population Data Science. Impact factor: 1.6 (JCR Q3, Health Care Sciences & Services). Pub Date: 2022-08-25. DOI: 10.23889/ijpds.v7i3.1848
Maya Murmann, Douglas Manuel

Abstract

Population-covering educational attainment registers have proven helpful for planning and research on educational efforts. Building and updating such a register requires regular linking of different databases. Without unique national identification numbers, record linkage must be based on quasi-identifiers such as names, date of birth and sex. High-quality record linkage requires the unique identification of persons. The available identifiers should therefore be sufficient for unique identification even when some identifiers are missing for some cases. Redundant identifiers can achieve this goal. However, the data protection principle of data minimization, as recommended in the European General Data Protection Regulation, aims to avoid collecting data beyond what the given purpose requires. A ministry therefore commissioned a simulation study to inform legislators about the minimum set of identifiers needed for a national register. A microsimulation of a population of nearly 20 million people was implemented to generate data on accumulating changes and errors in identifiers over ten simulated years. The simulation covered, for example, international migration, regional mobility, marriages, school careers and mortality. Each event triggered changes to identifiers according to specified error probability models. The resulting data were linked using different record-linkage procedures. Linkage quality and linkage bias were assessed as a function of the available identifiers. We report on the design of the simulation study, the linkage results and recommendations for the minimum set of identifiers. The results may be helpful for the design of other population-covering registers.
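The general idea of the abstract (events change identifiers with some probability, and linkage is then attempted with different identifier sets) can be illustrated with a toy sketch. This is not the authors' simulation: the population size, the two event types, the probabilities `p_name_change` and `p_typo`, the synthetic names, the exact-match linkage rule and the `IDENTIFIER_SETS` are all assumptions chosen for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical identifier sets to compare (not taken from the study).
IDENTIFIER_SETS = {
    "minimal": ["surname", "birth_year", "sex"],
    "full": ["surname", "given", "birth_year", "birth_day", "sex"],
}


def simulate(n_people=2000, years=10, p_name_change=0.02, p_typo=0.01, seed=0):
    """Toy microsimulation returning (baseline, follow_up) record lists."""
    rng = random.Random(seed)
    surnames = [f"SUR{i:03d}" for i in range(200)]
    given_names = [f"GIV{i:02d}" for i in range(50)]
    baseline = [
        {
            "pid": pid,  # true identity, used only to evaluate links
            "surname": rng.choice(surnames),
            "given": rng.choice(given_names),
            "birth_year": rng.randint(1950, 2010),
            "birth_day": rng.randint(1, 365),
            "sex": rng.choice("FM"),
        }
        for pid in range(n_people)
    ]
    follow_up = [dict(rec) for rec in baseline]
    for _ in range(years):
        for rec in follow_up:
            # Assumed event: a surname change (for example after marriage).
            if rng.random() < p_name_change:
                rec["surname"] = rng.choice(surnames)
            # Assumed error model: a crude stand-in for a typing error.
            if rng.random() < p_typo:
                rec["given"] += "X"
    return baseline, follow_up


def link(baseline, follow_up, keys):
    """Exact-match linkage; a key links only if it is unique in both files."""
    def index(records):
        idx = defaultdict(list)
        for rec in records:
            idx[tuple(rec[k] for k in keys)].append(rec["pid"])
        return idx

    idx_a, idx_b = index(baseline), index(follow_up)
    links = []
    for key, pids_a in idx_a.items():
        pids_b = idx_b.get(key, [])
        if len(pids_a) == 1 and len(pids_b) == 1:
            links.append((pids_a[0], pids_b[0]))
    return links


def quality(links, n_true):
    """Return (precision, recall) of the links against the true identities."""
    correct = sum(1 for a, b in links if a == b)
    precision = correct / len(links) if links else 0.0
    return precision, correct / n_true


if __name__ == "__main__":
    base, follow = simulate()
    for name, keys in IDENTIFIER_SETS.items():
        precision, recall = quality(link(base, follow, keys), len(base))
        print(f"{name}: precision={precision:.3f} recall={recall:.3f}")
```

Comparing the two identifier sets in such a sketch shows the trade-off the abstract describes: fewer identifiers mean more ambiguous keys, while more identifiers give more opportunities for an error to break a match. A real study would use probabilistic or error-tolerant linkage rather than this exact-match rule.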