Lightweight Protocols for Distributed Private Quantile Estimation
Anders Aamand, Fabrizio Boninsegna, Abigail Gentle, Jacob Imola, Rasmus Pagh
Learning one quantile (such as the median) is as hard as learning the entire CDF, motivating adaptive algorithms for this problem under local and shuffle differential privacy.
Abstract
Distributed data analysis is a large and growing field driven by a massive proliferation of user devices, and by privacy concerns surrounding the centralised storage of data. We consider two adaptive algorithms for estimating one quantile (e.g. the median) when each user holds a single data point lying in a domain \([B]\) that can be queried once through a private mechanism: one under local differential privacy (LDP) and another under shuffle differential privacy (shuffle-DP). In the adaptive setting we present an \(\varepsilon\)-LDP algorithm which can estimate any quantile within error \(\alpha\) using only \(O(\frac{\log B}{\varepsilon^2\alpha^2})\) users, and an \((\varepsilon,\delta)\)-shuffle-DP algorithm requiring only \(\widetilde{O}((\frac{1}{\varepsilon^2}+\frac{1}{\alpha^2})\log B)\) users. Prior (non-adaptive) algorithms require more users by several logarithmic factors in \(B\). We further provide a matching lower bound for adaptive protocols, showing that our LDP algorithm is optimal in the low-\(\varepsilon\) regime. Additionally, we establish lower bounds against non-adaptive protocols which, paired with our understanding of the adaptive case, prove a fundamental separation between these models.
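To make the setting concrete, the following is a minimal sketch of a naive adaptive baseline. It is not the algorithm of this paper and is not claimed to reach the user bounds stated above. It assumes that each user answers a single threshold query "is my value at most \(t\)?" through standard binary randomized response, which is \(\varepsilon\)-LDP, and that the analyst runs a binary search over \(\{1,\dots,B\}\) with a fresh, disjoint batch of users per round, so that every user is queried at most once. All function names are hypothetical and chosen for illustration.

```python
import math
import random


def randomized_response(bit: bool, epsilon: float, rng: random.Random) -> bool:
    """Binary randomized response: report the true bit with probability
    e^eps / (1 + e^eps), otherwise report its negation. This is eps-LDP."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep_prob else not bit


def debias(mean_report: float, epsilon: float) -> float:
    """Turn the mean of noisy bits into an unbiased estimate of the true mean."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return (mean_report - (1.0 - p)) / (2.0 * p - 1.0)


def naive_adaptive_quantile(data, B: int, q: float, epsilon: float, seed: int = 0) -> int:
    """Illustrative baseline (an assumption, not the paper's method):
    noisy binary search for the q-quantile over the domain {1, ..., B}.

    Users are split into ceil(log2 B) disjoint batches; round r asks batch r
    the threshold query x <= mid, so each user is queried at most once.
    """
    rng = random.Random(seed)
    users = list(data)
    rng.shuffle(users)  # assign users to rounds at random
    rounds = max(1, math.ceil(math.log2(B)))
    batch = len(users) // rounds
    if batch == 0:
        raise ValueError("need at least one user per round")

    lo, hi = 1, B
    for r in range(rounds):
        if lo >= hi:
            break
        mid = (lo + hi) // 2
        group = users[r * batch:(r + 1) * batch]
        reports = [randomized_response(x <= mid, epsilon, rng) for x in group]
        est_cdf = debias(sum(reports) / len(reports), epsilon)
        # The next threshold depends on earlier noisy answers: this is the
        # adaptivity that a non-adaptive protocol cannot use.
        if est_cdf >= q:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    B = 1024
    # Synthetic data, exactly uniform on {1, ..., B}.
    data = [(i % B) + 1 for i in range(200 * B)]
    print(naive_adaptive_quantile(data, B, q=0.5, epsilon=2.0))
```

Because this baseline splits its users evenly across roughly \(\log_2 B\) rounds and a single wrong comparison can misdirect the search, it is best read as a picture of the interaction model (one private query per user, with queries chosen adaptively) rather than as a sample-efficient protocol.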