Data Structures for Density Estimation

Proceedings of the ... International Conference on Machine Learning. Pub Date: 2023-06-20. DOI: 10.48550/arXiv.2306.11312

Anders Aamand, Alexandr Andoni, Justin Y. Chen, P. Indyk, Shyam Narayanan, Sandeep Silwal

Citations: 1

Abstract

We study statistical/computational tradeoffs for the following density estimation problem: given $k$ distributions $v_1, \ldots, v_k$ over a discrete domain of size $n$, and sampling access to a distribution $p$, identify a $v_i$ that is "close" to $p$. Our main result is the first data structure that, given a sublinear (in $n$) number of samples from $p$, identifies $v_i$ in time sublinear in $k$. We also give an improved version of the algorithm of Acharya et al. (2018) that reports $v_i$ in time linear in $k$. The experimental evaluation of the latter algorithm shows that it achieves a significant reduction in the number of operations needed to achieve a given accuracy compared to prior work.
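To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the data structure or the algorithm described in the abstract. It assumes "close" means small total variation distance (the abstract does not specify the metric), builds the empirical distribution from the samples, and scans all $k$ candidates, so it takes time linear in both $k$ and $n$. The function names and the toy candidates are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def total_variation(p, q):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def empirical_distribution(samples, n):
    """Empirical distribution over the domain {0, ..., n-1}."""
    counts = Counter(samples)
    m = len(samples)
    return [counts[x] / m for x in range(n)]


def nearest_candidate(candidates, samples):
    """Naive baseline: index of the candidate closest to the empirical
    distribution in total variation (assumed metric). Linear in k and n."""
    n = len(candidates[0])
    p_hat = empirical_distribution(samples, n)
    return min(
        range(len(candidates)),
        key=lambda i: total_variation(candidates[i], p_hat),
    )


# Toy instance (hypothetical data): k = 3 candidates over a domain of size n = 4.
rng = random.Random(0)
candidates = [
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
]
p = [0.65, 0.15, 0.10, 0.10]  # unknown distribution, accessed only via samples
samples = rng.choices(range(4), weights=p, k=2000)
best = nearest_candidate(candidates, samples)
print(best)
```

This baseline uses far more samples than the sublinear regime the abstract targets and always inspects every candidate; it only illustrates the inputs and the output of the problem.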


