Differential Privacy (DP). In our experiments, the training set consists of image-label pairs. We say that two datasets d and d' are "adjacent" when d contains a particular (image, label) pair and d' does not.
The idea behind basic differential privacy is that the output of a computation should not change much when any single record is present or absent. Formally, a randomized mechanism M is (\epsilon, \delta)-differentially private if, for all adjacent datasets d and d' and every set of outcomes S,

Pr[M(d) \in S] \le e^{\epsilon} Pr[M(d') \in S] + \delta.

The original definition did not have the final \delta term; it was added to account for the probability (at most \delta) that pure \epsilon-differential privacy fails to hold.
To make a function f: D -> R differentially private, a common methodology is to add noise scaled to f's sensitivity, defined as the maximum of |f(d) - f(d')| over all adjacent datasets d and d'.
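As a concrete illustration (not from the original text), the Laplace mechanism adds noise drawn from Lap(sensitivity / \epsilon) to the true answer. The sketch below assumes a counting query, whose sensitivity is 1 because adding or removing one (image, label) pair changes the count by at most one; the function name and example data are made up for this note.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Return an epsilon-DP estimate by adding Laplace(sensitivity/epsilon) noise."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon  # larger sensitivity or smaller epsilon -> more noise
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Counting query: "how many records have label 7?" has sensitivity 1,
# since one (image, label) pair changes the count by at most 1.
labels = np.array([7, 3, 7, 1, 7])
true_count = float(np.sum(labels == 7))
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
```

A smaller \epsilon widens the noise distribution, trading accuracy for stronger privacy.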
The approach consists of 1) differentially private SGD, 2) the moments accountant, and 3) hyper-parameter tuning.
differentially private SGD
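One DP-SGD step can be sketched as follows: clip each per-example gradient to an L2 norm bound C, sum the clipped gradients, add Gaussian noise with standard deviation proportional to C, then average and take an ordinary gradient step. The NumPy sketch below is a minimal illustration of that recipe; the function name and toy gradients are assumptions for this note, not the paper's implementation.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update: clip per-example gradients, add Gaussian noise, average."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale each gradient down so its L2 norm is at most clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian noise with std = noise_multiplier * clip_norm masks the
    # contribution of any single example.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    avg_grad = (total + noise) / len(per_example_grads)
    return params - lr * avg_grad
```

Clipping bounds each example's influence on the update (i.e., the sensitivity of the summed gradient), which is what lets Gaussian noise of a fixed scale provide a differential privacy guarantee.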

moments accountant
hyper-parameter tuning