
Privacy in statistical databases. PSD (Conference : 2004- ): Latest Articles

Comparing the Utility and Disclosure Risk of Synthetic Data with Samples of Microdata
Pub Date : 2022-07-02 DOI: 10.48550/arXiv.2207.03339
C. Little, M. Elliot, R. Allmendinger
Most statistical agencies release randomly selected samples of Census microdata, usually with sample fractions under 10% and with other forms of statistical disclosure control (SDC) applied. An alternative to SDC is data synthesis, which has been attracting growing interest, yet there is no clear consensus on how to measure the associated utility and disclosure risk of the data. The ability to produce synthetic Census microdata, where the utility and associated risks are clearly understood, could mean that more timely and wider-ranging access to microdata would be possible. This paper follows on from previous work by the authors which mapped synthetic Census data on a risk-utility (R-U) map. The paper presents a framework to measure the utility and disclosure risk of synthetic data by comparing it to samples of the original data of varying sample fractions, thereby identifying the sample fraction which has equivalent utility and risk to the synthetic data. Three commonly used data synthesis packages are compared with some interesting results. Further work is needed in several directions but the methodology looks very promising.
Pages: 234-249
Citations: 5
Utility and Disclosure Risk for Differentially Private Synthetic Categorical Data
Pub Date : 2022-06-03 DOI: 10.1007/978-3-031-13945-1_18
G. Raab
Pages: 250-265
Citations: 1
On Integrating the Number of Synthetic Data Sets m into the a priori Synthesis Approach
Pub Date : 2022-05-12 DOI: 10.1007/978-3-031-13945-1_15
James Jackson, R. Mitra, Brian Francis, Iain Dove
Pages: 205-219
Citations: 2
A Re-examination of the Census Bureau Reconstruction and Reidentification Attack
Pub Date : 2022-05-08 DOI: 10.48550/arXiv.2205.03939
K. Muralidhar
Recent analysis by researchers at the U.S. Census Bureau claims that by reconstructing the tabular data released from the 2010 Census, it is possible to reconstruct the original data and, using an accurate external data file with identity, reidentify 179 million respondents (approximately 58% of the population). This study shows that there are a practically infinite number of possible reconstructions, and each reconstruction leads to assigning a different identity to the respondents in the reconstructed data. The results reported by the Census Bureau researchers are based on just one of these infinite possible reconstructions and are easily refuted by an alternate reconstruction. Without definitive proof that the reconstruction is unique, or at the very least, that most reconstructions lead to the assignment of the same identity to the same respondent, claims of confirmed reidentification are highly suspect and easily refuted. The Census releases data at different geographic levels: nation, state, county, tract, block group, and block. The final three are census-defined constructs and do not necessarily correspond to traditional geographic classification. For personal level data, the data at the smaller geographic level is aggregated to the next higher level, that is, the results at the block level are aggregated to block groups, block groups are aggregated to tracts, etc. The multiple tables that are released (Total Population, Sex by Age, Total Races, and others) are all aggregations of the most detailed data release (Age by Sex, by Race, by Ethnicity). The different tables released form the basis of the reconstruction of the respondent microdata.
Pages: 312-323
Citations: 7
A Note on the Misinterpretation of the US Census Re-identification Attack
Pub Date : 2022-02-10 DOI: 10.1007/978-3-031-13945-1_21
Paul L. Francis
Pages: 299-311
Citations: 3
Perspectives for Tabular Data Protection - How About Synthetic Data?
Pub Date : 2022-01-01 DOI: 10.1007/978-3-031-13945-1_6
F. Geyer, R. Tent, Michel Reiffert, Sarah Giessing
Pages: 77-91
Citations: 1
Tit-for-Tat Disclosure of a Binding Sequence of User Analyses in Safe Data Access Centers
Pub Date : 2022-01-01 DOI: 10.1007/978-3-031-13945-1_10
J. Domingo-Ferrer
Pages: 133-141
Citations: 0
Membership Inference Attack Against Principal Component Analysis
Pub Date : 2022-01-01 DOI: 10.1007/978-3-031-13945-1_19
Oualid Zari, Javier Parra-Arnau, Ayşe Ünsal, T. Strufe, Melek Önen
Pages: 269-282
Citations: 1
On Privacy of Multidimensional Data Against Aggregate Knowledge Attacks
Pub Date : 2022-01-01 DOI: 10.1007/978-3-031-13945-1_7
Ala Eddine Laouir, Abdessamad Imine
Pages: 92-104
Citations: 0
Privacy in Statistical Databases: International Conference, PSD 2022, Paris, France, September 21–23, 2022, Proceedings
Pub Date : 2022-01-01 DOI: 10.1007/978-3-031-13945-1
Citations: 0