Revised: November 5, 2021
Published: August 6, 2022
Abstract: [Plain Text Version]
Most known algorithms in the streaming model of computation aim to approximate a single function such as an $\ell_p$ norm.
In 2009, Nelson (https://sublinear.info, Open Problem 30) asked if it is possible to design universal algorithms, that simultaneously approximate multiple functions of the stream. In this paper we answer the question of Nelson for the class of subset-$\ell_0$ norms in the insertion-only frequency-vector model. Given a family of subsets, $\cS\subset 2^{[n]}$, we provide a single streaming algorithm that can $(1\pm \epsilon)$-approximate the subset-$\ell_p$ norm for every $S\in\cS$. Here, the subset-$\ell_p$ norm of $v\in \RR^n$ with respect to the set $S\subseteq [n]$ is the $\ell_p$ norm of $v_{|S}$ (the vector $v$ restricted to $S$ by zeroing all other coordinates).
Our main result is a nearly tight characterization of the space complexity of the subset-$\ell_0$ norm for every family $\cS\subset 2^{[n]}$ in insertion-only streams, expressed in terms of the “heavy-hitter dimension” of $\cS$, a new combinatorial quantity related to the VC-dimension of $\cS$. We also show that the more general turnstile and sliding-window models require a much larger space usage. All these results easily extend to the $\ell_1$ norm.
In addition, we design algorithms for two other subset-$\ell_p$ variants. These can be compared to the famous Priority Sampling algorithm of Duffield, Lund and Thorup [JACM 2007], which achieves additive approximation $\epsilon ||v||_1$ for all possible subsets ($\cS=2^{[n]}$) in the entrywise update model. One of our algorithms extends their algorithm to handle turnstile updates, and another one achieves multiplicative approximation, given a family $\cS$.