| title | Forward Propagation | ||||
|---|---|---|---|---|---|
| sidebar_label | Forward Propagation | ||||
| description | Understanding how data flows from the input layer to the output layer to generate a prediction. | ||||
| tags |
|
Forward Propagation is the process by which a neural network transforms input data into an output prediction. It is the "inference" stage where data flows through the network layers, undergoing linear transformations and non-linear activations until it reaches the final layer.
In a dense (fully connected) network, the signal moves from left to right. For every neuron in a hidden or output layer, two distinct steps occur:
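The neuron takes all inputs from the previous layer, multiplies each one by its weight, and adds a bias term. This is a linear equation in several variables:

$$
z = \sum_{i=1}^{n} w_i x_i + b
$$

Where:
- $x_i$ = input features from the previous layer
- $w_i$ = weights associated with each input
- $b$ = bias term

The result $z$ is then passed through a non-linear activation function, such as the sigmoid $\sigma$, which gives the neuron's output $a = \sigma(z)$.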
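The two steps for a single neuron (weighted sum, then activation) can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `neuron_forward` and the input, weight, and bias values are made up for the example, and sigmoid is assumed as the activation.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(x, w, b):
    """Hypothetical helper: weighted sum plus bias, then sigmoid."""
    z = np.dot(w, x) + b   # step 1: linear combination
    a = sigmoid(z)         # step 2: non-linear activation
    return z, a

# Illustrative values only (not taken from any trained model)
x = np.array([0.5, 0.1, -0.2])   # inputs from the previous layer
w = np.array([0.4, -0.6, 0.2])   # one weight per input
b = 0.1                          # bias term

z, a = neuron_forward(x, w, b)
print(f"z = {z:.4f}, a = {a:.4f}")   # z = 0.2000, a = 0.5498
```

Here $z = 0.5 \cdot 0.4 + 0.1 \cdot (-0.6) + (-0.2) \cdot 0.2 + 0.1 = 0.2$, and the sigmoid maps that to roughly 0.55.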
In practice, we don't calculate one neuron at a time. We use linear algebra to compute an entire layer in a single matrix operation. This is why GPUs, which are built for fast matrix math, are so important for deep learning.
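If $X$ is the input vector, $W$ is the layer's weight matrix, and $b$ is its bias vector, then the weighted sums for the whole layer are:

$$
Z = W X + b
$$

Then, we apply the activation function element-wise:

$$
A = \sigma(Z)
$$

This output $A$ becomes the input to the next layer.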
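To see that the matrix form is only a compact way of writing the per-neuron calculation, the sketch below computes one layer both ways and checks that the results agree. The shapes (2 neurons, 3 inputs), the random seed, and the sigmoid activation are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)        # seeded so the run is repeatable
W = rng.standard_normal((2, 3))       # 2 neurons, 3 inputs each
b = rng.standard_normal(2)            # one bias per neuron
x = np.array([0.5, 0.1, -0.2])        # input vector

# One neuron at a time: each row of W holds the weights of one neuron
a_loop = np.array([sigmoid(np.dot(W[j], x) + b[j]) for j in range(W.shape[0])])

# Whole layer at once: a single matrix-vector product
a_vec = sigmoid(W @ x + b)

print(np.allclose(a_loop, a_vec))     # True
```

The vectorized line does the same arithmetic as the loop, but hands it to optimized matrix routines in one call.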
Imagine a simple network with 1 Hidden Layer:
```mermaid
graph LR
    %% Input Layer
    X1["$$x_1$$"] -->|"$$w_{11}^{[1]}$$"| H1
    X2["$$x_2$$"] -->|"$$w_{12}^{[1]}$$"| H1
    X3["$$x_3$$"] -->|"$$w_{13}^{[1]}$$"| H1
    X1 -->|"$$w_{21}^{[1]}$$"| H2
    X2 -->|"$$w_{22}^{[1]}$$"| H2
    X3 -->|"$$w_{23}^{[1]}$$"| H2
    %% Hidden Layer
    H1["$$z_1^{[1]} \\ a_1^{[1]} = \sigma(z_1^{[1]})$$"]
    H2["$$z_2^{[1]} \\ a_2^{[1]} = \sigma(z_2^{[1]})$$"]
    %% Output Layer
    H1 -->|"$$w_1^{[2]}$$"| Y
    H2 -->|"$$w_2^{[2]}$$"| Y
    Y["$$z^{[2]} \\ \hat{y} = \sigma(z^{[2]})$$"]
    %% Bias annotations
    B1["$$b^{[1]}$$"] -.-> H1
    B1 -.-> H2
    B2["$$b^{[2]}$$"] -.-> Y
```
- Input: Your features (e.g., pixel values of an image).
- Hidden Layer: Extracts abstract features (e.g., edges or shapes).
- Output Layer: Provides the final guess (e.g., "This is a dog with 92% probability").
The term "propagate" is used because the output of one layer is the input of the next. The information "spreads" through the network. Each layer acts as a filter, refining the raw data into more meaningful representations until a decision can be made at the end.
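The idea that each layer's output feeds the next layer can be written as a short loop. The sketch below is one possible way to organize it, assuming every layer uses a sigmoid activation; the function name `forward` and the list-of-`(W, b)` representation are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Hypothetical helper: run x through a list of (W, b) layers.

    Returns the activations of every layer, starting with the input itself.
    """
    a = x
    activations = [a]
    for W, b in layers:
        a = sigmoid(W @ a + b)   # this layer's output becomes the next input
        activations.append(a)
    return activations

rng = np.random.default_rng(42)
layers = [
    (rng.standard_normal((2, 3)), rng.standard_normal(2)),  # hidden: 3 -> 2
    (rng.standard_normal((1, 2)), rng.standard_normal(1)),  # output: 2 -> 1
]

acts = forward(np.array([0.5, 0.1, -0.2]), layers)
for i, a in enumerate(acts):
    print(f"layer {i}: shape {a.shape}")
```

Adding a layer only means appending another `(W, b)` pair whose input size matches the previous layer's output size.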
This snippet demonstrates the math behind a single forward pass for a network with one hidden layer.
```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation: maps any real value into (0, 1)."""
    return 1 / (1 + np.exp(-x))

# 1. Inputs (3 features)
X = np.array([0.5, 0.1, -0.2])

# 2. Weights and Biases (Hidden Layer with 2 neurons)
W1 = np.random.randn(2, 3)
b1 = np.random.randn(2)

# 3. Weights and Biases (Output Layer with 1 neuron)
W2 = np.random.randn(1, 2)
b2 = np.random.randn(1)

# --- FORWARD PASS ---
# Layer 1 (Hidden)
z1 = np.dot(W1, X) + b1
a1 = sigmoid(z1)

# Layer 2 (Output)
z2 = np.dot(W2, a1) + b2
prediction = sigmoid(z2)

print(f"Model Prediction: {prediction}")
```

Forward propagation gives us a prediction. However, at the start, the weights are random, so the prediction is unlikely to be accurate. To make the model "learn," we must:
- Compare the prediction to the truth using a Loss Function.
- Send the error backward through the network using Backpropagation.
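As a small illustration of the first step, the sketch below scores a prediction against the true label. Binary cross-entropy is assumed here because the output is a sigmoid probability; the text above does not name a specific loss, and the probabilities used are made up.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for one example; y_pred is a probability in (0, 1)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)   # avoid log(0)
    return float(-(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

y_true = 1.0                                   # the true label
good_guess = binary_cross_entropy(y_true, 0.92)
bad_guess = binary_cross_entropy(y_true, 0.08)

print(f"loss at 0.92: {good_guess:.4f}")       # small loss
print(f"loss at 0.08: {bad_guess:.4f}")        # much larger loss
```

The further the prediction is from the truth, the larger the loss, and that number is what gets sent backward through the network.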
Further reading:
- DeepLearning.AI: Neural Networks and Deep Learning (Week 2)
- Khan Academy: Matrix Multiplication Foundations
We have the prediction. Now, how do we tell the network it made a mistake? Head over to the Backpropagation guide to learn how neural networks learn from their errors!