Scaling health analytics to millions without compromising privacy using deep distributed behavior models
Petar Velickovic, N. Lane, S. Bhattacharya, A. Chieh, O. Bellahsen, M. Vegreville
Proceedings of the 11th EAI International Conference on Pervasive Computing Technologies for Healthcare, 2017-05-23
DOI: 10.1145/3154862.3154873
Citations: 5
Abstract
People are naturally sensitive to the sharing of their health data, collected by various connected consumer devices (e.g., smart scales, sleep trackers), with third parties. However, sharing this data to compute aggregate statistics and comparisons is a basic building block for a range of medical studies based on large-scale consumer devices; such studies have the potential to transform how we study disease and behavior. Furthermore, informing users as to how their health measurements and activities compare with those of friends, demographic peers, and the global population has been shown to be a powerful tool for behavior change and management in individuals. While experienced organizations can safely perform aggregate user health analysis, there is a significant need for new privacy-preserving mechanisms that enable people to engage in the same way even with untrusted third parties (e.g., small or recently established organizations). In this work, we propose a new approach to this problem grounded in the use of deep distributed behavior models. These are discriminative deep learning models that can approximate the calculation of various aggregate functions. Models are bootstrapped with training data from a modestly sized cohort and then distributed directly to personal devices to estimate, for example, how the user ranks (perhaps in terms of daily step counts) against various demographic ranges (like age and sex). Critically, the user's own data now never has to leave the device. We validate this method using a 1.2M-user, 22-month dataset spanning body weight, sleep hours and step counts collected by devices from Nokia Digital Health - Withings. Experiments show our framework remains accurate for a range of commonly used statistical aggregate functions. This result opens a powerful new paradigm for privacy-preserving analytics under which user data largely remains on personal devices, overcoming a variety of potential privacy risks.
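The core idea — train a discriminative model server-side on a modest cohort to approximate an aggregate function, then ship only the model parameters to the device so the user's raw measurement never leaves it — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic step-count cohort, the tiny one-hidden-layer network, and the age-banded percentile target are all assumptions standing in for the deeper models and real Withings data used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Server side: bootstrap a behavior model from a modestly sized cohort ---
# Hypothetical synthetic cohort: daily step counts whose mean drifts with age.
n = 5000
age = rng.integers(20, 70, size=n).astype(float)
steps = rng.normal(9000 - 60 * age, 1500)

def empirical_rank(s, a):
    """Exact percentile rank of step count s among peers within 5 years of age a.
    Computed on the cohort only to produce training labels."""
    peers = steps[np.abs(age - a) <= 5]
    return float((peers < s).mean())

# Normalized inputs and percentile-rank targets.
X = np.column_stack([steps / 20000.0, age / 100.0])
y = np.array([empirical_rank(s, a) for s, a in zip(steps, age)])

# Tiny one-hidden-layer MLP trained by full-batch gradient descent -- a
# stand-in for the deeper discriminative models described in the paper.
h = 16
W1 = rng.normal(0, 0.5, (2, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.5, h);      b2 = 0.0
for _ in range(2000):
    z = np.tanh(X @ W1 + b1)           # hidden activations
    err = (z @ W2 + b2) - y            # gradient of 0.5 * mean squared error
    gW2 = z.T @ err / n; gb2 = err.mean()
    dz = np.outer(err, W2) * (1 - z ** 2)
    gW1 = X.T @ dz / n; gb1 = dz.mean(0)
    W1 -= 0.5 * gW1; b1 -= 0.5 * gb1
    W2 -= 0.5 * gW2; b2 -= 0.5 * gb2

# --- Device side: only the parameters (W1, b1, W2, b2) are distributed;
# the user's own step count is evaluated locally and never uploaded. ---
def on_device_rank(user_steps, user_age):
    x = np.array([user_steps / 20000.0, user_age / 100.0])
    return float(np.tanh(x @ W1 + b1) @ W2 + b2)

est = on_device_rank(7200, 30)          # estimated percentile among age peers
true = empirical_rank(7200, 30)         # cohort-side ground truth, for comparison
```

The privacy property comes from the direction of data flow: after bootstrapping, inference runs entirely on-device, so the third party sees model parameters going out but no raw measurements coming back.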