---
title: "GANs: Generative Adversarial Networks"
sidebar_label: GANs
description: Understanding the competitive framework between Generators and Discriminators to create realistic synthetic data.
tags:
---
Introduced by Ian Goodfellow and his colleagues in 2014, Generative Adversarial Networks (GANs) are a class of machine learning frameworks in which two neural networks compete with each other in a game. Through this competition, the model learns to generate new, synthetic data that is (ideally) indistinguishable from real data.
A GAN consists of two distinct models that are trained simultaneously through competition:
- The Generator ($G$): Think of this as a forger. Its goal is to create realistic images (or data) from random noise in order to trick the Discriminator.
- The Discriminator ($D$): Think of this as a detective. Its goal is to distinguish between "real" data (from the training set) and "fake" data (produced by the Generator).
The GAN training process is a "minimax" game where the Generator tries to minimize the probability that the Discriminator is correct, while the Discriminator tries to maximize it.
- The Generator takes random noise as input and produces a synthetic sample.
- The Discriminator receives both real samples and synthetic samples.
- Feedback Loop:
  - If the Detective ($D$) catches the Forger ($G$), $G$ learns how to improve its forgery.
  - If the Forger ($G$) tricks the Detective ($D$), $D$ learns how to be a better investigator.
Eventually, the Generator becomes so good that the Discriminator can only guess with 50% accuracy (equivalent to a coin flip).
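This feedback loop maps directly onto alternating gradient updates: one step for $D$, then one step for $G$. Below is a minimal, hedged sketch in PyTorch; the "real" batch is random noise purely so the loop runs end to end, and all model sizes are illustrative:

```python
import torch
import torch.nn as nn

# Tiny stand-in networks; a fuller pair is defined later on this page.
G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())   # Generator: noise -> sample
D = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())  # Discriminator: sample -> P(real)

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(100):
    real = torch.randn(32, 784)  # stand-in for a batch of real data
    z = torch.randn(32, 100)     # random noise input for the Generator

    # Step 1: train the Detective. Real samples get target 1, fakes target 0.
    fake = G(z).detach()         # detach so this step does not update G
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Step 2: train the Forger. G wants D to output 1 ("real") for its fakes.
    fake = G(torch.randn(32, 100))
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```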
The entire system can be described by the following value function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

where:

- $D(x)$: the Discriminator's estimate of the probability that real data $x$ is real.
- $G(z)$: the Generator's output for a given noise vector $z$.
- $D(G(z))$: the Discriminator's estimate of the probability that a fake sample is real.
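To connect the formula to code: training the Discriminator with binary cross-entropy against targets 1 (real) and 0 (fake) minimizes exactly $-V(D, G)$. A minimal sketch, assuming placeholder probabilities in place of real model outputs:

```python
import torch
import torch.nn.functional as F

# Placeholder probabilities standing in for D(x) and D(G(z)).
d_real = torch.full((16, 1), 0.9)  # D(x): D is fairly sure real data is real
d_fake = torch.full((16, 1), 0.2)  # D(G(z)): D suspects the fakes are fake

# The Discriminator maximizes V(D, G), i.e. it minimizes
# -[log D(x) + log(1 - D(G(z)))]: binary cross-entropy with targets 1 and 0.
d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) \
       + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))

# The Generator wants D(G(z)) close to 1; the common implementation trains it
# against target 1 (the "non-saturating" variant discussed below).
g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
print(d_loss.item(), g_loss.item())
```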
The following diagram illustrates the interaction between the two networks and the data sources.
```mermaid
graph LR
    Noise[Random Noise Z] --> Gen[Generator G]
    Gen --> Fake[Fake Samples]
    Real[Real Dataset X] --> Disc[Discriminator D]
    Fake --> Disc
    Disc --> Prediction{Real or Fake?}
    Prediction -- Error Feedback --> Disc
    Prediction -- Loss Signal --> Gen
```
Training GANs is notoriously difficult because of the delicate balance required between the two models:
- Mode Collapse: The Generator discovers a single "type" of output that reliably fools the Discriminator and keeps producing only that (e.g., a model meant to generate all ten digits produces only the digit "7").
- Vanishing Gradients: If the Discriminator becomes too strong, the Generator receives almost no useful gradient to learn from (see the numerical sketch after this list).
- Convergence: Unlike standard models, GANs may never reach a stable point, instead oscillating back and forth.
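The Vanishing Gradients problem can be made concrete. With the original minimax generator loss $\log(1 - D(G(z)))$, an over-confident Discriminator leaves the Generator almost no gradient, which is why implementations usually switch to the "non-saturating" loss $-\log D(G(z))$. A small numerical sketch (the logit value is illustrative):

```python
import torch

# Suppose D confidently rejects a fake: D(G(z)) = sigmoid(-6.0) ≈ 0.0025.
logit = torch.tensor([-6.0], requires_grad=True)

# Original minimax generator loss: minimize log(1 - D(G(z))).
loss_saturating = torch.log(1 - torch.sigmoid(logit))
loss_saturating.backward()
print(logit.grad)  # ≈ -0.0025: the gradient has all but vanished

logit.grad = None  # reset before the second backward pass

# Non-saturating alternative: minimize -log D(G(z)) instead.
loss_non_saturating = -torch.log(torch.sigmoid(logit))
loss_non_saturating.backward()
print(logit.grad)  # ≈ -0.9975: a strong learning signal remains
```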
| Variant | Key Feature | Use Case |
|---|---|---|
| DCGAN | Uses Convolutional layers instead of Dense layers. | Generating high-quality images. |
| CycleGAN | Learns to translate images from one domain to another without paired data. | Turning photos into paintings, or horses into zebras. |
| StyleGAN | Allows control over specific "styles" (age, hair color, etc.). | Generating hyper-realistic human faces. |
| Pix2Pix | Conditional GAN for image-to-image translation. | Converting sketches into realistic photos. |
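To make the DCGAN row concrete, here is a hedged sketch of a DCGAN-style generator, in which transposed convolutions upsample a noise vector into an image; the layer sizes are illustrative rather than the exact architecture from the DCGAN paper:

```python
import torch
import torch.nn as nn

# DCGAN-style generator sketch: 100-dim noise -> 64x64 RGB image.
dcgan_generator = nn.Sequential(
    # Treat the noise as a 100-channel 1x1 "image" and project it to 4x4.
    nn.ConvTranspose2d(100, 512, kernel_size=4, stride=1, padding=0),
    nn.BatchNorm2d(512),
    nn.ReLU(),
    nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1),  # 8x8
    nn.BatchNorm2d(256),
    nn.ReLU(),
    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 16x16
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # 32x32
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),     # 64x64
    nn.Tanh(),
)

z = torch.randn(1, 100, 1, 1)
print(dcgan_generator(z).shape)  # torch.Size([1, 3, 64, 64])
```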
For comparison, here is a simple fully connected Generator/Discriminator pair, sized for flattened 28×28 grayscale images (784 values):

```python
import torch
import torch.nn as nn

# Simple Discriminator: maps a flattened 28x28 image (784 values)
# to a single probability that the input is real.
discriminator = nn.Sequential(
    nn.Linear(784, 128),
    nn.LeakyReLU(0.2),
    nn.Linear(128, 1),
    nn.Sigmoid()
)

# Simple Generator: maps a 100-dimensional noise vector to a flattened image.
generator = nn.Sequential(
    nn.Linear(100, 256),
    nn.ReLU(),
    nn.Linear(256, 784),
    nn.Tanh()  # Outputs pixels between -1 and 1
)
```
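A quick shape check, continuing from the block above (the batch size of 16 is arbitrary):

```python
z = torch.randn(16, 100)               # a batch of 16 noise vectors
fake_images = generator(z)             # shape: (16, 784), values in (-1, 1)
verdicts = discriminator(fake_images)  # shape: (16, 1), P(real) for each fake
print(fake_images.shape, verdicts.shape)
```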
Further reading:

- Original Paper: [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661) (Goodfellow et al., 2014)
- [Google Developers: GANs Course](https://developers.google.com/machine-learning/gan)
- [This Person Does Not Exist](https://thispersondoesnotexist.com/): a showcase of StyleGAN capabilities
GANs are masters of generation, but they are hard to control. What if we wanted a model that can gradually "denoise" an image into existence?