Hello! In this lesson, we'll examine the inner workings of backpropagation, the algorithm at the heart of neural network training, and implement it from scratch in Python.
A neural network consists of an input layer, one or more hidden layers, and an output layer. Each layer contains neurons (nodes) connected by links, each carrying a weight. These weights, together with bias terms, determine the network's output (for simplicity, our implementation omits explicit bias terms). In our Python code, the size of the input layer adjusts to the shape of `self.input`, the hidden layer hosts four neurons (`self.weights1`), and the output layer accommodates one neuron (`self.weights2`).
Our activation function, the sigmoid function, squashes any real-valued number into the range between 0 and 1. Let's recall its mathematical definition:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

The derivative of the sigmoid function plays an essential role in backpropagation for the weight updates. It is:

$$\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$$

These functions are implemented in Python as `sigmoid(x)` and `sigmoid_derivative(x)`. Note that `sigmoid_derivative` expects the *output* of the sigmoid as its argument, which is why it computes `x * (1.0 - x)` rather than applying the full formula.
```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Assumes x is already the sigmoid's output: sigma'(z) = sigma(z) * (1 - sigma(z))
    return x * (1.0 - x)
```
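As a quick sanity check of the convention that `sigmoid_derivative` receives an already-activated value, here is a small standalone snippet (not part of the lesson's class, just an illustration reusing the two functions above):

```python
a = sigmoid(0.0)              # 0.5 -- the sigmoid's midpoint
print(sigmoid_derivative(a))  # 0.25 -- sigma'(0) = 0.5 * (1 - 0.5), the derivative's maximum
```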
The following methods will be defined in a class initialized like this:
```python
class NeuralNetwork:
    def __init__(self, x, y, learning_rate=0.1):
        self.input = x
        self.weights1 = np.random.rand(self.input.shape[1], 4)  # input -> hidden (4 neurons)
        self.weights2 = np.random.rand(4, 1)                    # hidden -> output (1 neuron)
        self.y = y
        self.output = np.zeros(self.y.shape)
        self.learning_rate = learning_rate
```
The `self.weights1` and `self.weights2` parameters here refer to the weights of the connections from the input layer to the hidden layer and from the hidden layer to the output layer, respectively. `self.y` stores the target data in the instance, and `self.output` creates a NumPy array filled with zeros to hold the neural network's output.
Feedforward propagation involves data moving from the input layer to the output layer, passing through the hidden layer. At each layer, the inputs are multiplied by the corresponding weights, and the resulting values are passed through the activation function (the sigmoid function, in this scenario).
```python
def feedforward(self):
    # Implements the feedforward pass using dot products and the sigmoid function
    self.layer1 = sigmoid(np.dot(self.input, self.weights1))
    self.output = sigmoid(np.dot(self.layer1, self.weights2))
```
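To make the matrix dimensions concrete, here is a quick shape check (a standalone sketch reusing the `sigmoid` defined above, not part of the class; the 4x3 input mirrors the XOR data we use later):

```python
X = np.random.rand(4, 3)        # 4 samples, 3 features
W1 = np.random.rand(3, 4)       # input -> hidden
W2 = np.random.rand(4, 1)       # hidden -> output
layer1 = sigmoid(np.dot(X, W1))
out = sigmoid(np.dot(layer1, W2))
print(layer1.shape, out.shape)  # (4, 4) (4, 1)
```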
Backpropagation is crucial to the learning process of neural networks. It corrects the network's error by propagating the error from the output layer back to the input layer, adjusting the weights to minimize the discrepancy between the predicted outputs (`self.output`) and the actual outputs (`self.y`). For a single weight, this process is mathematically presented as:

$$\Delta w_{ij} = \eta \, \delta_j \, x_i$$

Where:

- $\Delta w_{ij}$ denotes the magnitude of the weight adjustment
- $\eta$ represents the learning rate, dictating the pace at which our model learns
- $\delta_j$ designates the error term for output unit $j$, representing the difference between the predicted and actual output
- $x_i$ identifies the input associated with the weight $w_{ij}$
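To connect this rule to the code below, here is a sketch of the gradients for our two weight matrices, assuming the squared-error loss $L = \sum (y - \hat{y})^2$ that the implementation implicitly minimizes (with $z_1 = X W_1$, $a_1 = \sigma(z_1)$, $z_2 = a_1 W_2$, and $\hat{y} = \sigma(z_2)$):

$$\frac{\partial L}{\partial W_2} = -\,a_1^{\top}\bigl[2(y - \hat{y}) \odot \sigma'(z_2)\bigr]$$

$$\frac{\partial L}{\partial W_1} = -\,X^{\top}\Bigl(\bigl[2(y - \hat{y}) \odot \sigma'(z_2)\bigr] W_2^{\top} \odot \sigma'(z_1)\Bigr)$$

The code's `d_weights2` and `d_weights1` are exactly these expressions without the leading minus sign, so adding them to the weights (`+=`) steps in the negative-gradient direction, which is standard gradient descent.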
In the `backprop` method, the error indicates how much the weights need to be adjusted. The derivatives of the loss with respect to the weights (`d_weights2` and `d_weights1`) are computed from this error and the derivatives of the neuron outputs; the weights are then updated using these derivatives.
```python
def backprop(self):
    # Compute the weight gradients from the output error and the sigmoid derivatives
    d_weights2 = np.dot(self.layer1.T,
                        2 * (self.y - self.output) * sigmoid_derivative(self.output))
    d_weights1 = np.dot(self.input.T,
                        np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output),
                               self.weights2.T) * sigmoid_derivative(self.layer1))

    # Step in the direction that reduces the error
    self.weights1 += self.learning_rate * d_weights1
    self.weights2 += self.learning_rate * d_weights2
```
An epoch is one complete pass through the entire training dataset. The `train` method applies feedforward and backpropagation repeatedly over several epochs to adjust the weights and minimize the error. Multiple epochs give the model numerous opportunities to learn and correct its errors as it converges toward weights that yield good predictions.
```python
def train(self, epochs):
    # Repeatedly perform feedforward and backpropagation for the given number of epochs
    for epoch in range(epochs):
        self.feedforward()
        self.backprop()
```
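If you'd like to watch the error shrink during training, a minimal variation of `train` (a sketch with a hypothetical `log_every` parameter, not part of the lesson's original class) could print the mean squared error periodically:

```python
def train(self, epochs, log_every=500):
    # Same loop as above, plus optional mean-squared-error logging (log_every is our own addition)
    for epoch in range(epochs):
        self.feedforward()
        self.backprop()
        if log_every and epoch % log_every == 0:
            mse = np.mean((self.y - self.output) ** 2)
            print(f"epoch {epoch}: mse = {mse:.5f}")
```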
Now we can define the `predict` method.
```python
def predict(self, new_input):
    # Same forward pass as feedforward, but applied to unseen input
    layer1 = sigmoid(np.dot(new_input, self.weights1))
    output = sigmoid(np.dot(layer1, self.weights2))
    return output
```
The `predict` method computes outputs for given inputs by propagating them through the layers with dot-product operations and the sigmoid activation function. It is essentially the `feedforward` method applied to new data rather than the stored training input.
Let's implement these concepts for the XOR (exclusive OR) problem, where the output is 1 when exactly one of the two inputs is 1. We initialize our neural network with inputs `X` and corresponding outputs `Y`, and train it over 10,000 epochs. (Note that the third column of `X` is a constant 1; since our implementation has no explicit bias terms, this column effectively serves as a bias input.) The weights adjust accordingly, enabling the correct prediction of the XOR problem.
```python
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
Y = np.array([[0], [1], [1], [0]])
nn = NeuralNetwork(X, Y)

nn.train(10000)
print("\nPredictions:")
for i, x in enumerate(X):
    print(f"Input: {x} ---> Prediction: {nn.predict(np.array([x]))}, Expected: {Y[i]}")
```
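The raw predictions are sigmoid outputs between 0 and 1 rather than hard 0/1 labels. If you want binary answers, you can round them; this is a small optional step, not part of the original lesson code:

```python
for i, x in enumerate(X):
    p = nn.predict(np.array([x]))[0, 0]  # extract the scalar from the (1, 1) output
    print(f"Input: {x} ---> {p:.3f} -> class {int(round(p))}, Expected: {Y[i][0]}")
```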
Congratulations! You've dissected the fundamental backpropagation algorithm, understood the mathematics underpinning it, and implemented it from scratch in Python. Experiment with the learning rate, the number of epochs, and the hidden-layer size, and observe how these changes affect your neural network's output. Keep exploring and enjoy your voyage through deep learning!