
Differentially Private GAN


2024. 1. 29.

Introduction

  • Propose a differentially private generative adversarial network
  • Uses the Wasserstein distance, which behaves better than the JS divergence, i.e. the WGAN framework

Framework

  • GAN : minimax game between generator and discriminator.
$$\min_{G}\max_{D}\ \mathrm{E}_{x\sim p^{\ast}(x)}\log D(x) + \mathrm{E}_{z\sim q(z)}\log(1-D(G(z)))$$

With the Wasserstein distance, the discriminator is replaced by a critic $f_{w}$ with clipped weights $w\in\mathcal{W}$:

$$\min_{G}\max_{w\in\mathcal{W}}\ \mathrm{E}_{x\sim p^{\ast}(x)}f_{w}(x) - \mathrm{E}_{z\sim q(z)}f_{w}(G(z))$$
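As a toy illustration of the WGAN objective, here is a minimal numpy sketch with a 1-D linear critic $f_w(x) = wx$; all names, distributions, and parameter values below are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D setup: real data from N(2, 1), generator g_theta(z) = theta + z
# with z ~ N(0, 1), and a linear critic f_w(x) = w * x with |w| <= 1
# (so the critic is 1-Lipschitz).
w, theta = 1.0, 0.0
x = rng.normal(2.0, 1.0, size=1000)   # real batch
z = rng.normal(0.0, 1.0, size=1000)   # prior batch
fake = theta + z                      # generated batch

# Empirical WGAN objective: E[f_w(x)] - E[f_w(G(z))].
# The critic maximizes this quantity; the generator minimizes it.
wgan_obj = np.mean(w * x) - np.mean(w * fake)
print(wgan_obj)  # roughly 2, the gap between the two means
```

For this linear critic the objective estimates the difference of the means, which is exactly the 1-D Wasserstein-1 distance between the two Gaussians here.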

How to achieve differential privacy during the learning algorithm?

Algorithm

DPGAN Algorithm

  • At line 7 of the algorithm, Gaussian noise is added to the gradient of the Wasserstein distance.
  • Line 9: weight clipping (how about using the gradient penalty term instead?)

Privacy Guarantee

  • $\theta$ : parameters of the generator, defined through the discriminator parameters $w$
  • $\epsilon$ : privacy budget (a smaller budget guarantees a higher level of privacy)

Consider the update procedure in the algorithm above, for a fixed $t_{2}$.

$$\begin{aligned}
&\mathbf{z}\sim p(z) : \text{a batch of prior samples}\\
&\mathbf{x}\sim p^{\ast}(x) : \text{a batch of real data points}\\
&g_{w}(x_{i},z_{i}) \leftarrow \nabla_{w}\left(f_{w}(x_{i})-f_{w}(g_{\theta}(z_{i}))\right)\\
&\bar g_{w}=\frac{1}{m}\sum_{i=1}^{m}g_{w}(x_{i},z_{i})+\mathcal{N}(\mathbf{0},\sigma_{n}^{2}c_{g}^{2}\mathbf{I})\\
&w^{(t_{2}+1)} = w^{(t_{2})} + \text{Gradient Descent}\\
&w^{(t_{2}+1)} = \text{clip}(w^{(t_{2}+1)}, -c_{p},c_{p})
\end{aligned}$$
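The steps above can be sketched in numpy with a linear critic $f_w(x) = w\cdot x$ standing in for the real network; every concrete value here (batch size, dimensions, $c_g$, $c_p$, $\sigma_n$, learning rate) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# One noisy discriminator update, following the procedure above.
m, d = 64, 5                          # batch size, parameter dimension
c_g, c_p, sigma_n = 1.0, 0.01, 1.0    # gradient bound, weight clip, noise scale
lr = 1e-3
w = rng.normal(size=d) * c_p
x = rng.normal(1.0, 1.0, size=(m, d))     # real batch  x ~ p*(x)
fake = rng.normal(0.0, 1.0, size=(m, d))  # generated batch g_theta(z)

# For a linear critic, the per-example gradient of f_w(x_i) - f_w(g_theta(z_i))
# w.r.t. w is simply x_i - fake_i.  We clip explicitly so ||g_w|| <= c_g holds;
# in the paper this bound comes from Lemma 1 rather than explicit clipping.
g = x - fake
norms = np.linalg.norm(g, axis=1, keepdims=True)
g = g / np.maximum(1.0, norms / c_g)

# Average the per-example gradients and add Gaussian noise N(0, sigma_n^2 c_g^2 I).
g_bar = g.mean(axis=0) + rng.normal(0.0, sigma_n * c_g, size=d)

# Gradient ascent step for the critic, then weight clipping to [-c_p, c_p].
w = np.clip(w + lr * g_bar, -c_p, c_p)
print(np.abs(w).max())  # never exceeds c_p
```

The only data-dependent quantity released to the rest of the algorithm is the noisy $\bar g_{w}$, which is what makes the downstream updates differentially private.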

Since the procedure above produces a new output $w^{(t_{2}+1)}$ from the dataset $D$, an auxiliary input $w^{(t_{2})}$, and the noise structure, it can be regarded as a mechanism, written as

$$\mathcal{A}_{p}(D)=M(\text{aux},D)$$

Thus, we can define the privacy loss for $M$ as follows.

Definition

The privacy loss at $o$ is defined as

$$c(o;M,\text{aux},D,D') := \log \frac{\Pr(M(\text{aux},D)=o)}{\Pr(M(\text{aux},D')=o)}$$
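This definition can be made concrete for a scalar Gaussian mechanism, which is not the full DPGAN mechanism but the simplest case where the log-ratio has a closed form; the values of $\sigma$ and the query outputs below are illustrative assumptions:

```python
# Privacy loss of a scalar Gaussian mechanism M(D) = f(D) + N(0, sigma^2),
# with f(D) = 0 and f(D') = 1 on neighboring datasets (sensitivity 1).
sigma = 2.0
mu, mu_prime = 0.0, 1.0

def privacy_loss(o):
    # log [ p_D(o) / p_D'(o) ] for two Gaussian densities with equal variance;
    # the normalizing constants cancel, leaving a simple quadratic expression.
    return ((o - mu_prime) ** 2 - (o - mu) ** 2) / (2 * sigma ** 2)

print(privacy_loss(0.0))  # 0.125: the output looks more like it came from D
print(privacy_loss(0.5))  # 0.0: equidistant from both means, no information leaked
```

Outputs near $f(D)$ give a positive loss (they point to $D$), and the loss grows as the noise scale $\sigma$ shrinks.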

where $D,D'$ are neighboring datasets. Also, we can define the privacy loss random variable $C$ as follows.

$$C(M,\text{aux},D,D') := c(M(D);M,\text{aux},D,D')$$

Log moment generating function

$$\alpha_{M}(\lambda;\text{aux},D,D') := \log \mathrm{E}_{o\sim M(\text{aux},D)}\left[\exp\left(\lambda\, c(o;M,\text{aux},D,D')\right)\right]$$

Moments accountant

$$\alpha_{M}(\lambda) := \max_{\text{aux},D,D'}\alpha_{M}(\lambda;\text{aux},D,D')$$
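For the scalar Gaussian mechanism with sensitivity 1 and noise scale $\sigma$, the log moment generating function has the closed form $\alpha(\lambda)=\lambda(\lambda+1)/(2\sigma^{2})$; composition sums the $\alpha$'s over steps, and the tail bound $\delta \le \min_{\lambda}\exp(\alpha(\lambda)-\lambda\epsilon)$ converts back to $(\epsilon,\delta)$. A minimal sketch of that conversion (the function name and all parameter values are illustrative, and this ignores subsampling):

```python
import math

def eps_from_accountant(sigma, steps, delta, max_lambda=64):
    # Moments accountant for `steps` compositions of the Gaussian mechanism:
    # alpha(lambda) = steps * lambda * (lambda + 1) / (2 * sigma^2),
    # then eps(lambda) solves exp(alpha - lambda * eps) = delta.
    best = float("inf")
    for lam in range(1, max_lambda + 1):
        alpha = steps * lam * (lam + 1) / (2 * sigma ** 2)
        eps = (alpha + math.log(1 / delta)) / lam
        best = min(best, eps)  # minimize the bound over integer lambda
    return best

print(eps_from_accountant(sigma=4.0, steps=10, delta=1e-5))  # roughly 4
```

Minimizing over $\lambda$ is what makes this bound much tighter than naive composition of per-step $(\epsilon,\delta)$ guarantees.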

See (Abadi et al., 2016).

Lemma

Under the conditions of Algorithm 1, assume that the activation function of the discriminator has a bounded range and bounded derivatives everywhere, $\sigma(\cdot)\le B_{\sigma}$ and $\sigma'(\cdot)\le B_{\sigma'}$, and that the input space is compact, so that $\Vert \mathbf{x}\Vert\leq B_{x}$. Then $\left\Vert g_{w}(\mathbf{x}^{(i)},\mathbf{z}^{(i)})\right\Vert \le c_{g}$ for some constant $c_{g}$.

Remark. The ReLU and Softplus activation functions are unbounded, but boundedness is still guaranteed: since the input space is compact, the set of function values attained on it is bounded as well.

Lemma 2

Given the sampling probability $q=\frac{m}{M}$, the number of discriminator iterations in each inner loop $n_{d}$, and the privacy violation $\delta$, for arbitrary $\epsilon>0$ the parameters of the discriminator guarantee $(\epsilon,\delta)$-differential privacy with respect to all the data points used in that outer loop if we choose:

$$\sigma_{n}=\frac{2q}{\epsilon}\sqrt{n_{d}\log\left(\frac{1}{\delta}\right)}$$
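Plugging illustrative values into the formula above (a batch of 64 out of 50,000 records, $n_{d}=5$ critic steps, $\delta=10^{-5}$, $\epsilon=10$; none of these numbers come from the paper's experiments):

```python
import math

def sigma_n(q, eps, n_d, delta):
    # Noise scale from Lemma 2: sigma_n = (2q / eps) * sqrt(n_d * log(1/delta)).
    return (2 * q / eps) * math.sqrt(n_d * math.log(1 / delta))

# Small sampling rate q keeps the required noise scale small.
print(sigma_n(q=64 / 50000, eps=10.0, n_d=5, delta=1e-5))  # about 0.0019
```

Note the trade-offs the formula encodes: more critic iterations $n_{d}$ or a tighter budget $\epsilon$ require more noise, while a smaller sampling rate $q$ requires less.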

Proof. See the paper.

Theorem

Algorithm 1 learns a generator that guarantees $(\epsilon,\delta)$-differential privacy.

References

  • Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., & Zhang, L. (2016). Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS).
  • Xie, L., Lin, K., Wang, S., Wang, F., & Zhou, J. (2018). Differentially Private Generative Adversarial Network. arXiv:1802.06739.