• Media type: Text; Electronic Conference Proceeding; E-Article
  • Title: Memory-Sample Lower Bounds for Learning Parity with Noise
  • Contributor: Garg, Sumegha [Author]; Kothari, Pravesh K. [Author]; Liu, Pengda [Author]; Raz, Ran [Author]
  • Imprint: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2021
  • Language: English
  • DOI: https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.60
  • Keywords: memory-sample tradeoffs ; space lower bound ; learning parity under noise ; branching program
  • Footnote: This data source also contains holdings records that do not lead to a full text.
  • Description: In this work, we show, for the well-studied problem of learning parity under noise, where a learner tries to learn x = (x₁,…,x_n) ∈ {0,1}ⁿ from a stream of random linear equations over 𝔽₂ that are correct with probability 1/2+ε and flipped with probability 1/2-ε (0 < ε < 1/2), that any learning algorithm requires either a memory of size Ω(n²/ε) or an exponential number of samples. In fact, we study memory-sample lower bounds for a large class of learning problems, as characterized by [Garg et al., 2018], when the samples are noisy. A matrix M: A × X → {-1,1} corresponds to the following learning problem with error parameter ε: an unknown element x ∈ X is chosen uniformly at random. A learner tries to learn x from a stream of samples (a₁, b₁), (a₂, b₂), …, where for every i, a_i ∈ A is chosen uniformly at random and b_i = M(a_i,x) with probability 1/2+ε and b_i = -M(a_i,x) with probability 1/2-ε (0 < ε < 1/2). Assume that k, 𝓁, r are such that any submatrix of M with at least 2^{-k} ⋅ |A| rows and at least 2^{-𝓁} ⋅ |X| columns has a bias of at most 2^{-r}. We show that any learning algorithm for the learning problem corresponding to M, with error parameter ε, requires either a memory of size at least Ω((k⋅𝓁)/ε) or at least 2^{Ω(r)} samples. The result holds even if the learner has an exponentially small success probability (of 2^{-Ω(r)}). In particular, this shows that for a large class of learning problems, the same as those in [Garg et al., 2018], any learning algorithm requires either a memory of size at least Ω(((log|X|)⋅(log|A|))/ε) or an exponential number of noisy samples. Our proof is based on adapting the arguments in [Ran Raz, 2017; Garg et al., 2018] to the noisy case. (An illustrative sampling sketch for this model follows the record below.)
  • Access State: Open Access
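
To make the sampling model in the Description concrete, the following short Python sketch (illustrative only, not part of the cited work) generates a stream of noisy parity samples: each a_i is uniform over {0,1}ⁿ and the label equals ⟨a_i, x⟩ mod 2 with probability 1/2+ε, flipped otherwise. The function name noisy_parity_samples and the toy parameters are assumptions made for this example; in the ±1 matrix view of the abstract, the label corresponds to M(a_i, x) = (-1)^{⟨a_i, x⟩}.

    import random

    def noisy_parity_samples(x, eps, num_samples, seed=0):
        """Yield noisy parity samples (a_i, b_i) for a hidden vector x in {0,1}^n.

        Each a_i is uniform over {0,1}^n; b_i equals the parity <a_i, x> mod 2
        with probability 1/2 + eps and is flipped with probability 1/2 - eps.
        """
        rng = random.Random(seed)
        n = len(x)
        for _ in range(num_samples):
            a = [rng.randint(0, 1) for _ in range(n)]
            parity = sum(ai * xi for ai, xi in zip(a, x)) % 2
            flip = 1 if rng.random() < 0.5 - eps else 0  # noise event
            yield a, parity ^ flip

    if __name__ == "__main__":
        # Hypothetical toy instance: n = 8, eps = 0.1, five samples.
        hidden_x = [1, 0, 1, 1, 0, 0, 1, 0]
        for a, b in noisy_parity_samples(hidden_x, eps=0.1, num_samples=5):
            print(a, b)

Under the paper's result, a streaming learner observing such samples needs either memory Ω(n²/ε) or exponentially many of them; the sketch only illustrates the input distribution, not any learning algorithm.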