• Media type: Text; E-Article; Electronic Conference Proceeding
  • Title: Finer-Grained Hardness of Kernel Density Estimation
  • Contributor: Alman, Josh [Author]; Guan, Yunfeng [Author]
  • Published: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024
  • Language: English
  • DOI: https://doi.org/10.4230/LIPIcs.CCC.2024.35
  • Keywords: Fine-Grained Complexity; Schur Polynomials; Kernel Density Estimation
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: In batch Kernel Density Estimation (KDE) for a kernel function f : ℝ^m × ℝ^m → ℝ, we are given as input 2n points x^{(1)}, …, x^{(n)}, y^{(1)}, …, y^{(n)} ∈ ℝ^m in dimension m, as well as a vector w ∈ ℝⁿ. These inputs implicitly define the n × n kernel matrix K given by K[i,j] = f(x^{(i)}, y^{(j)}). The goal is to compute a vector v ∈ ℝⁿ which approximates Kw, i.e., with ||Kw - v||_∞ < ε ||w||₁. For illustrative purposes, consider the Gaussian kernel f(x,y) := e^{-||x-y||₂²}; a minimal sketch of the exact baseline computation appears after this record. The classic approach to this problem is the famous Fast Multipole Method (FMM), which runs in time n ⋅ O(log^m(ε^{-1})) and is particularly effective in low dimensions because of its exponential dependence on m. Recently, as the higher-dimensional case m ≥ Ω(log n) has seen more applications in machine learning and statistics, new algorithms have focused on this setting: an algorithm using discrepancy theory, which runs in time O(n / ε), and an algorithm based on the polynomial method, which achieves inverse polynomial accuracy in almost linear time when the input points have bounded squared diameter B = o(log n). A recent line of work has proved fine-grained lower bounds, with the goal of showing that the "curse of dimensionality" arising in FMM is necessary assuming the Strong Exponential Time Hypothesis (SETH). Backurs et al. [NeurIPS 2017] first showed the hardness of a variety of Empirical Risk Minimization problems including KDE for Gaussian-like kernels in the case with high dimension m = Ω(log n) and large scale B = Ω(log n). Alman et al. [FOCS 2020] later developed new reductions in roughly this same parameter regime, leading to lower bounds for more general kernels, but only for very small error ε < 2^{- log^{Ω(1)} (n)}. In this paper, we refine the approach of Alman et al. to show new lower bounds in all parameter regimes, closing gaps between the known algorithms and lower bounds. For example:
    - In the setting where m = C log n and B = o(log n), we prove Gaussian KDE requires n^{2-o(1)} time to achieve additive error ε ...
  • Access State: Open Access
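
  The following is a minimal, illustrative Python sketch (not taken from the paper) of the exact O(n²·m) baseline computation of Kw for the Gaussian kernel described above, together with a check of the additive-error guarantee ||Kw - v||_∞ < ε ||w||₁. The function name, the random test data, and the tolerance are hypothetical choices for illustration; only NumPy is assumed.

    # Illustrative sketch: exact batch Gaussian KDE, i.e. the quadratic-time baseline
    # that the algorithms and lower bounds in the record above are measured against.
    import numpy as np

    def gaussian_kde_exact(x, y, w):
        """Compute v = K w exactly, where K[i, j] = exp(-||x_i - y_j||_2^2).

        x, y : arrays of shape (n, m) -- the two point sets.
        w    : array of shape (n,)    -- the weight vector.
        Runs in O(n^2 * m) time.
        """
        # Squared Euclidean distances via ||x_i||^2 + ||y_j||^2 - 2 <x_i, y_j>.
        sq = (np.sum(x**2, axis=1)[:, None]
              + np.sum(y**2, axis=1)[None, :]
              - 2.0 * (x @ y.T))
        K = np.exp(-np.maximum(sq, 0.0))  # clamp tiny negatives from round-off
        return K @ w

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, m = 500, 20                              # hypothetical problem size
        x, y = rng.normal(size=(n, m)), rng.normal(size=(n, m))
        w = rng.normal(size=n)
        v_exact = gaussian_kde_exact(x, y, w)
        v_approx = v_exact + 1e-6 * rng.normal(size=n)  # stand-in for an approximate answer
        # Additive error eps in the sense ||Kw - v||_inf < eps * ||w||_1.
        eps = np.max(np.abs(v_exact - v_approx)) / np.sum(np.abs(w))
        print(f"achieved additive error eps = {eps:.2e}")

  Any faster KDE algorithm (FMM, discrepancy-based, or polynomial-method) is judged by how small an eps it achieves, in the sense checked in the last lines, and at what cost in n, m, and B.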