Anand Jerry George; Lekshmi Ramesh; Aditya Vikram Singh; Himanshu Tyagi
IEEE Journal on Selected Areas in Information Theory, vol. 5, pp. 28-43. Published 2024-02-22. DOI: 10.1109/JSAIT.2024.3366086. https://ieeexplore.ieee.org/document/10443583/
Continual Mean Estimation Under User-Level Privacy
We consider the problem of continually releasing a user-level differentially private (DP) estimate of the population mean of a stream of samples. At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now, the requirements of continual release and user-level privacy have been considered in isolation. In practice, however, both requirements arise together, since users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: denoting by $m_{t}$ the maximum number of samples contributed by a user, as long as $\tilde{\Omega}(1/\varepsilon)$ users have $m_{t}/2$ samples each, the error at time $t$ is $\tilde{O}(1/\sqrt{t}+\sqrt{m_{t}}/(t\varepsilon))$. This is a universal error guarantee that is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed an equal number of samples.
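To get a feel for the two terms in the stated error guarantee, the following minimal Python sketch evaluates $1/\sqrt{t}+\sqrt{m_{t}}/(t\varepsilon)$ with all constants and logarithmic factors dropped. It is only an illustration of the formula in the abstract, not the authors' algorithm; the function names `error_bound` and `equal_contribution_bound`, and the equal-contribution setup with `n_users` users holding `m` samples each, are assumptions introduced here.

```python
import math


def error_bound(t: int, m_t: int, eps: float) -> float:
    """Leading terms of the stated guarantee at time t.

    Constants and log factors hidden by the tilde-O notation are dropped,
    so the value is only meaningful for comparing regimes.
    """
    if t <= 0 or m_t <= 0 or eps <= 0:
        raise ValueError("t, m_t and eps must be positive")
    statistical_term = 1.0 / math.sqrt(t)      # sampling error of the mean
    privacy_term = math.sqrt(m_t) / (t * eps)  # cost of user-level eps-DP
    return statistical_term + privacy_term


def equal_contribution_bound(n_users: int, m: int, eps: float) -> float:
    """Hypothetical special case: n_users users with exactly m samples each.

    Then t = n_users * m and m_t = m, so the privacy term simplifies to
    1 / (n_users * sqrt(m) * eps).
    """
    return error_bound(n_users * m, m, eps)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    for eps in (0.1, 1.0, 10.0):
        print(eps, equal_contribution_bound(n_users=25, m=4, eps=eps))
```

In this sketch the privacy term shrinks as $1/t$ while the statistical term shrinks as $1/\sqrt{t}$, so for fixed $m_{t}$ and $\varepsilon$ the sampling error eventually dominates as more samples arrive.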