---
title: "GANs: Generative Adversarial Networks"
sidebar_label: GANs
description: Understanding the competitive framework between Generators and Discriminators to create realistic synthetic data.
tags:
---
Introduced by Ian Goodfellow and his colleagues in 2014, Generative Adversarial Networks (GANs) are a class of machine learning frameworks in which two neural networks compete with each other in a game. Through this competition, the model learns to generate new, synthetic data that is (ideally) indistinguishable from real data.
A GAN consists of two distinct models that are trained simultaneously through competition:
- The Generator ($G$): Think of this as a forger. Its goal is to create realistic images (or data) from random noise in order to trick the Discriminator.
- The Discriminator ($D$): Think of this as a detective. Its goal is to distinguish between "real" data (from the training set) and "fake" data (produced by the Generator).
The GAN training process is a "minimax" game where the Generator tries to minimize the probability that the Discriminator is correct, while the Discriminator tries to maximize it.
- The Generator takes random noise as input and produces a synthetic sample.
- The Discriminator receives both real samples and synthetic samples.
- Feedback Loop:
  - If the Detective ($D$) catches the Forger ($G$), $G$ learns how to improve its forgery.
  - If the Forger ($G$) tricks the Detective ($D$), $D$ learns how to be a better investigator.
Eventually, the Generator becomes so good that the Discriminator can only guess with 50% accuracy (equivalent to a coin flip).
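This feedback loop maps directly onto alternating gradient updates: one step for $D$, then one step for $G$. Below is a minimal, hedged sketch in PyTorch; the "real" batch is random noise purely so the loop runs end to end, and all model sizes are illustrative:

```python
import torch
import torch.nn as nn

# Tiny stand-in networks; a fuller pair is defined later on this page.
G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())   # Generator: noise -> sample
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())  # Discriminator: sample -> P(real)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(100):
    real = torch.randn(32, 784)  # stand-in for a batch of real data
    z = torch.randn(32, 100)     # random noise input for the Generator

    # Step 1: train the Detective. Real samples get target 1, fakes target 0.
    fake = G(z).detach()         # detach so this step does not update G
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Step 2: train the Forger. G wants D to output 1 ("real") for its fakes.
    fake = G(torch.randn(32, 100))
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```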
The entire system can be described by the following value function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

where:

- $D(x)$: the Discriminator's estimate of the probability that real data $x$ is real.
- $G(z)$: the Generator's output for a given noise vector $z$.
- $D(G(z))$: the Discriminator's estimate of the probability that a fake sample is real.
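To connect the formula to code: training the Discriminator with binary cross-entropy against targets 1 (real) and 0 (fake) minimizes exactly $-V(D, G)$. A minimal sketch, assuming placeholder probabilities in place of real model outputs:

```python
import torch
import torch.nn.functional as F

# Placeholder probabilities standing in for D(x) and D(G(z)).
d_real = torch.full((16, 1), 0.9)  # D(x): D is fairly sure real data is real
d_fake = torch.full((16, 1), 0.2)  # D(G(z)): D suspects the fakes are fake

# The Discriminator maximizes V(D, G), i.e. it minimizes
# -[log D(x) + log(1 - D(G(z)))]: binary cross-entropy with targets 1 and 0.
d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) \
       + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))

# The Generator wants D(G(z)) close to 1; the common implementation trains it
# against target 1 (the "non-saturating" variant discussed below).
g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
print(d_loss.item(), g_loss.item())
```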
The following diagram illustrates the interaction between the two networks and the data sources.
```mermaid
graph LR
    Noise[Random Noise Z] --> Gen[Generator G]
    Gen --> Fake[Fake Samples]
    Real[Real Dataset X] --> Disc[Discriminator D]
    Fake --> Disc
    Disc --> Prediction{Real or Fake?}
    Prediction -- Error Feedback --> Disc
    Prediction -- Loss Signal --> Gen
```
Training GANs is notoriously difficult because of the delicate balance required between the two models:
- Mode Collapse: The Generator discovers a single "type" of output that reliably fools the Discriminator and keeps producing only that (e.g., a model meant to generate all ten digits produces only the digit "7").
- Vanishing Gradients: If the Discriminator becomes too strong, the Generator receives almost no useful gradient to learn from (see the numerical sketch after this list).
- Convergence: Unlike standard models, GANs may never reach a stable point, instead oscillating back and forth.
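The Vanishing Gradients problem can be made concrete. With the original minimax generator loss $\log(1 - D(G(z)))$, an over-confident Discriminator leaves the Generator almost no gradient, which is why implementations usually switch to the "non-saturating" loss $-\log D(G(z))$. A small numerical sketch (the logit value is illustrative):

```python
import torch

# Suppose D confidently rejects a fake: D(G(z)) = sigmoid(-6.0) ≈ 0.0025.
logit = torch.tensor([-6.0], requires_grad=True)

# Original minimax generator loss: minimize log(1 - D(G(z))).
loss_saturating = torch.log(1 - torch.sigmoid(logit))
loss_saturating.backward()
print(logit.grad)  # ≈ -0.0025: the gradient has all but vanished

logit.grad = None  # reset before the second backward pass

# Non-saturating alternative: minimize -log D(G(z)) instead.
loss_non_saturating = -torch.log(torch.sigmoid(logit))
loss_non_saturating.backward()
print(logit.grad)  # ≈ -0.9975: a strong learning signal remains
```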
| Variant | Key Feature | Use Case |
|---|---|---|
| DCGAN | Uses Convolutional layers instead of Dense layers. | Generating high-quality images. |
| CycleGAN | Learns to translate images from one domain to another without paired data. | Turning photos into paintings, or horses into zebras. |
| StyleGAN | Allows control over specific "styles" (age, hair color, etc.). | Generating hyper-realistic human faces. |
| Pix2Pix | Conditional GAN for image-to-image translation. | Converting sketches into realistic photos. |
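To make the DCGAN row concrete, here is a hedged sketch of a DCGAN-style generator, in which transposed convolutions upsample a noise vector into an image; the layer sizes are illustrative rather than the exact architecture from the DCGAN paper:

```python
import torch
import torch.nn as nn

# DCGAN-style generator sketch: 100-dim noise -> 64x64 RGB image.
dcgan_generator = nn.Sequential(
    # Treat the noise as a 100-channel 1x1 "image" and project it to 4x4.
    nn.ConvTranspose2d(100, 512, kernel_size=4, stride=1, padding=0),
    nn.BatchNorm2d(512),
    nn.ReLU(),
    nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),  # 8x8
    nn.BatchNorm2d(256),
    nn.ReLU(),
    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 16x16
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # 32x32
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),     # 64x64
    nn.Tanh(),
)

z = torch.randn(1, 100, 1, 1)
print(dcgan_generator(z).shape)  # torch.Size([1, 3, 64, 64])
```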
For comparison, here is a simple fully connected Generator/Discriminator pair, sized for flattened 28×28 grayscale images (784 values):

```python
import torch
import torch.nn as nn

# Simple Discriminator: maps a flattened 28x28 image (784 values)
# to a single probability that the input is real.
discriminator = nn.Sequential(
    nn.Linear(784, 128),
    nn.LeakyReLU(0.2),
    nn.Linear(128, 1),
    nn.Sigmoid()
)

# Simple Generator: maps a 100-dimensional noise vector to a flattened image.
generator = nn.Sequential(
    nn.Linear(100, 256),
    nn.ReLU(),
    nn.Linear(256, 784),
    nn.Tanh()  # Outputs pixels between -1 and 1
)
```
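A quick shape check, continuing from the block above (the batch size of 16 is arbitrary):

```python
z = torch.randn(16, 100)               # a batch of 16 noise vectors
fake_images = generator(z)             # shape: (16, 784), values in (-1, 1)
verdicts = discriminator(fake_images)  # shape: (16, 1), P(real) for each fake
print(fake_images.shape, verdicts.shape)
```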
Further reading:

- Original Paper: [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661) (Goodfellow et al., 2014)
- [Google Developers: GANs Course](https://developers.google.com/machine-learning/gan)
- [This Person Does Not Exist](https://thispersondoesnotexist.com/): a showcase of StyleGAN capabilities
GANs are masters of generation, but they are hard to control. What if we wanted a model that can gradually "denoise" an image into existence?