How Much User Context Do We Need? Privacy by Design in Mental Health NLP Applications

Proceedings of the International AAAI Conference on Web and Social Media | Pub Date: 2023-06-02 | DOI: 10.1609/icwsm.v17i1.22186

Ramit Sawhney, Atula Neerkaje, Ivan Habernal, Lucie Flek


Citations: 0

Abstract

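Clinical NLP tasks, such as mental health assessment from text, must take social constraints into account: performance maximization must be constrained by the paramount importance of guaranteeing the privacy of user data. Consumer protection regulations such as GDPR generally handle privacy by restricting data availability, for example by requiring that user data be limited to "what is necessary" for a given purpose. In this work, we argue that providing stricter formal privacy guarantees while increasing the volume of user data in the model in most cases increases the benefit for all parties involved, especially the user. We demonstrate our arguments on two existing suicide risk assessment datasets of Twitter and Reddit posts. We present the first analysis juxtaposing user history length and differential privacy budgets, and elaborate how modeling additional user context enables utility preservation while maintaining acceptable user privacy guarantees.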


