Understanding Utility and Privacy of Demographic Data in Education Technology by Causal Analysis and Adversarial-Censoring

Rakibul Hasan, Mario Fritz
Proceedings on Privacy Enhancing Technologies (Privacy Enhancing Technologies Symposium), 2022(1), pages 245-262. Published 2022-03-03.
DOI: 10.2478/popets-2022-0044 (https://doi.org/10.2478/popets-2022-0044)
Citations: 4

Abstract

Education technologies (EdTech) are becoming pervasive due to their cost-effectiveness, accessibility, and scalability. They also experienced accelerated market growth during the recent pandemic. EdTech collects massive amounts of students’ behavioral and (sensitive) demographic data, a practice often justified by the potential to help students by personalizing education. Researchers have voiced concerns regarding privacy and data abuses (e.g., targeted advertising) in the absence of clearly defined data collection and sharing policies. However, technical contributions to alleviating students’ privacy risks have been scarce. In this paper, we argue against collecting demographic data by showing that gender, a widely used demographic feature, does not causally affect students’ course performance, arguably the most popular target of predictive models. Then, we show that gender can be inferred from behavioral data; thus, simply leaving demographic data out does not protect students’ privacy. Combining a feature selection mechanism with an adversarial censoring technique, we propose a novel approach to create a ‘private’ version of a dataset comprising fewer features that predict the target without revealing gender and that are interpretable. We conduct comprehensive experiments on a public dataset to demonstrate the robustness and generalizability of our mechanism.
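The leakage claim in the abstract (a sensitive attribute can be inferred from behavioral features even when it is never collected) can be illustrated with a minimal sketch. This is not the authors' method, dataset, or adversarial-censoring model. Everything below is an assumption made for illustration: the data is synthetic, the "behavioral" feature and its correlation with the sensitive bit are invented, the attacker is a plain logistic regression written in standard-library Python, and dropping the correlated column is only a crude stand-in for censoring.

```python
import math
import random


def make_synthetic_students(n, seed=0):
    """Return (features, sensitive_bit) pairs of purely synthetic data.

    Feature 0 is a hypothetical behavioral signal whose mean depends on the
    sensitive bit; feature 1 is independent noise. Both are assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s = rng.randint(0, 1)
        proxy = rng.gauss(1.0 if s == 1 else -1.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        rows.append(((proxy, noise), s))
    return rows


def select_columns(rows, cols):
    """Keep only the listed feature columns."""
    return [(tuple(x[c] for c in cols), s) for x, s in rows]


def train_logistic(rows, epochs=100, lr=0.5):
    """Fit logistic regression with full-batch gradient descent."""
    dim = len(rows[0][0])
    w = [0.0] * dim
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in rows:
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(rows, w, b):
    """Fraction of rows whose sensitive bit is predicted correctly."""
    hits = 0
    for x, y in rows:
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        if (z > 0) == (y == 1):
            hits += 1
    return hits / len(rows)


def attacker_accuracy(cols, n_train=1500, n_test=1000):
    """Train an attacker on the given columns and score it on held-out data."""
    train = select_columns(make_synthetic_students(n_train, seed=1), cols)
    test = select_columns(make_synthetic_students(n_test, seed=2), cols)
    w, b = train_logistic(train)
    return accuracy(test, w, b)


if __name__ == "__main__":
    # The sensitive bit is never given to the attacker as a feature.
    print("attacker accuracy, all behavioral features:", attacker_accuracy((0, 1)))
    print("attacker accuracy, proxy feature dropped:  ", attacker_accuracy((1,)))
```

In this toy setup the attacker does well above chance when the correlated feature is present and roughly at chance once it is removed. Real behavioral data typically spreads such signal across many features, which is why the abstract pairs feature selection with adversarial censoring rather than relying on dropping a single column.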