Accurate estimation of probability density functions is essential for probabilistic modeling but remains a major challenge, particularly for large-scale, multidimensional datasets. Kernel Density Estimation (KDE), one of the most widely used nonparametric methods, has been extensively studied; for such datasets, however, it is limited by high computational cost, suboptimal bandwidth selection, and density leakage. To address these limitations, we propose a method that reformulates bandwidth selection as a gradient-based optimization task in the frequency domain, thereby resolving all three shortcomings simultaneously. In this framework, the data are discretized and transformed using the discrete cosine transform, which decouples the computational complexity from the dataset size. We then construct a differentiable objective function that combines a frequency-domain fidelity loss with a regularization penalty, and stabilize it with a normalization scheme. The optimal bandwidth vector is obtained by minimizing this objective with the Adam optimizer using its analytical gradient. The proposed approach outperforms classical and transformation-based estimators, as well as copula models, in both efficiency and accuracy, while matching specialized asymmetric product kernels at a much lower computational cost. Overall, it provides a reliable, data-driven solution for density estimation from one to many dimensions.
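The pipeline sketched in the abstract (bin the data, take a DCT so that kernel smoothing becomes per-coefficient damping, then tune the bandwidth by gradient descent with Adam) can be illustrated in one dimension as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the fidelity-plus-roughness objective, the regularization weight `lam`, and all hyperparameters below are illustrative placeholders chosen to mirror the structure the abstract describes.

```python
import numpy as np
from scipy.fft import dct, idct

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=5000)

# Step 1: bin the samples on a regular grid. All subsequent work costs
# O(M log M) in the grid size M, independent of the sample count N.
M = 256
lo, hi = data.min() - 1.0, data.max() + 1.0
L = hi - lo
hist, edges = np.histogram(data, bins=M, range=(lo, hi), density=True)
dx = edges[1] - edges[0]

# Step 2: DCT-II of the binned data. Gaussian smoothing with bandwidth h
# becomes per-coefficient damping g_k(h) = exp(-0.5 * (pi * k * h / L)^2).
c = dct(hist, norm='ortho')
k = np.arange(M)
a = (np.pi * k / L) ** 2  # so g_k = exp(-0.5 * a_k * h^2)

def density(h):
    """Smoothed density on the grid for bandwidth h (frequency domain)."""
    return idct(c * np.exp(-0.5 * a * h * h), norm='ortho')

# Step 3: a differentiable objective with an analytical gradient in h.
# NOTE: this fidelity + roughness loss is an illustrative stand-in, not
# the paper's actual objective or normalization scheme.
lam = 1e-9  # hypothetical regularization weight

def loss_and_grad(h):
    g = np.exp(-0.5 * a * h * h)
    fidelity = np.sum(((1.0 - g) * c) ** 2)      # stay close to the data
    rough = lam * np.sum((k ** 2 * g * c) ** 2)  # penalize wiggly estimates
    dg = -a * h * g                              # dg_k / dh
    dfid = np.sum(-2.0 * (1.0 - g) * c ** 2 * dg)
    drough = lam * np.sum(2.0 * k ** 4 * c ** 2 * g * dg)
    return fidelity + rough, dfid + drough

# Step 4: scalar Adam on the bandwidth, using the analytical gradient.
h, m, v = 0.5, 0.0, 0.0
b1, b2, lr, eps = 0.9, 0.999, 0.01, 1e-8
for t in range(1, 501):
    _, grad = loss_and_grad(h)
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    h -= lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
    h = max(h, 1e-4)  # keep the bandwidth positive

pdf = density(h)
```

Because the damping factor is 1 at the zero-frequency coefficient, the smoothed estimate keeps the unit integral of the binned density; in higher dimensions the same scheme applies a multidimensional DCT and one bandwidth per axis, which is why the paper speaks of optimizing a bandwidth vector.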
