Private Learning and Sanitization: Pure vs. Approximate Differential Privacy

Beimel, Amos; Nissim, Kobbi; Stemmer, Uri

doi:10.4086/toc.2016.v012a001

Volume 12 (2016) Article 1 pp. 1-61
APPROX-RANDOM 2013 Special Issue

Private Learning and Sanitization: Pure vs. Approximate Differential Privacy

by Amos Beimel, Kobbi Nissim, and Uri Stemmer

Received: September 11, 2013
Revised: May 18, 2014
Published: May 27, 2016

Download article from ToC site:
[PDF (567K)] [PS (3722K)] [Source ZIP]

Misc.:
[About the Authors] [HTML Bibliography] [BibTeX]

Keywords: differential privacy, sample complexity, private learning, sanitization

Categories: cryptography, differential privacy, sample complexity, private learning, sanitization, learning, special issue, RANDOM, APPROX-RANDOM 2013 special issue

ACM Classification: K.4.1, I.2.6, F.2.0

AMS Classification: 68Q32, 68Q25, 68W20

Abstract: [Plain Text Version]

We compare the sample complexity of private learning [Kasiviswanathan et al. 2008] and sanitization [Blum et al. 2008] under pure $\epsilon$-differential privacy [Dwork et al. TCC 2006] and approximate $(\epsilon,\delta)$-differential privacy [Dwork et al. Eurocrypt 2006]. We show that the sample complexity of these tasks under approximate differential privacy can be significantly lower than that under pure differential privacy.

We define a family of optimization problems, which we call Quasi-Concave Promise Problems, that generalizes some of the tasks we consider. We observe that a quasi-concave promise problem can be privately approximated using a solution to a smaller instance of a quasi-concave promise problem. This allows us to construct an efficient recursive algorithm to solve such problems privately. Specifically, we construct private learners for point functions, threshold functions, and axis-aligned rectangles in high dimension. Similarly, we construct sanitizers for point functions and threshold functions.

We also examine the sample complexity of label-private learners, a relaxation of private learning where the learner is required to only protect the privacy of the labels in the sample. We show that the VC dimension completely characterizes the sample complexity of such learners, that is, the sample complexity of learning with label privacy is equal (up to constants) to learning without privacy.

An extended abstract of this paper appeared in the Proceedings of the 17th International Workshop on Randomization and Computation, 2013.

DOI: 10.4086/toc.2016.v012a001

Licensed under a Creative Commons Attribution License (CC-BY)