Lesson 3

Hello! In this lesson, we'll examine the inner workings of the **backpropagation algorithm**, a cornerstone of neural network training, and implement it from scratch in Python.

A neural network consists of an input layer, one or more hidden layers, and an output layer. Each layer houses neurons (nodes) connected by links that carry weights; these weights, together with any bias terms, determine the network's output. In our Python code, the size of the input layer adjusts to the shape of `self.input`, the hidden layer hosts four neurons (`self.weights1`), and the output layer has a single neuron (`self.weights2`).

Our activation function, the sigmoid function, squashes real-valued numbers into the range between 0 and 1. Let's recall its mathematical definition:

$sigmoid(x) = \frac{1}{1+e^{-x}}$

The derivative of the sigmoid function plays an essential role in backpropagation, where it scales the weight updates. In the formula below (mirroring how the code calls it), $x$ stands for the sigmoid's *output* rather than its raw input, so the derivative reads:

$sigmoid\_derivative(x) = x * (1 - x)$
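
This compact form comes from differentiating the definition above: the derivative of $\frac{1}{1+e^{-x}}$ is $\frac{e^{-x}}{(1+e^{-x})^2}$, which factors as

$sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))$

so it can be computed from the sigmoid's output alone.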

These functions are implemented in Python as `sigmoid(x)` and `sigmoid_derivative(x)`:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1.0 - x)
```
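
As a quick sanity check, here is a small usage sketch (the values for $x = 0$ are exact). Note that `sigmoid_derivative` is applied to the sigmoid's *output*, which is exactly how backpropagation will use it later:

```python
s = sigmoid(np.array([0.0]))
print(s)                      # [0.5], since sigmoid(0) = 1 / (1 + 1)
print(sigmoid_derivative(s))  # [0.25], i.e., 0.5 * (1 - 0.5)
```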

The following methods will be defined in a class initialized like this:

```python
class NeuralNetwork:
    def __init__(self, x, y, learning_rate=0.1):
        self.input = x
        # Weights from the input layer to the 4-neuron hidden layer
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        # Weights from the hidden layer to the single output neuron
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)
        self.learning_rate = learning_rate
```

The `self.weights1` and `self.weights2` parameters hold the weights of the connections from the input layer to the hidden layer and from the hidden layer to the output layer, respectively. The `self.y` attribute stores the target data in the instance, and `self.output` is a NumPy array filled with zeros that will hold the neural network's output.
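
To make these shapes concrete, here is a minimal sketch (the variable names `X_demo` and `Y_demo` are illustrative; any dataset with matching shapes works):

```python
X_demo = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])
Y_demo = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(X_demo, Y_demo)
print(nn.weights1.shape)  # (3, 4): 3 input features -> 4 hidden neurons
print(nn.weights2.shape)  # (4, 1): 4 hidden neurons -> 1 output neuron
print(nn.output.shape)    # (4, 1): one prediction slot per sample, all zeros for now
```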

Feedforward propagation moves data from the input layer, through the hidden layer, to the output layer. At each layer, the inputs are multiplied by the corresponding weights, and the results are passed through the activation function (the sigmoid, in this scenario).

```python
def feedforward(self):
    # Implements feedforward using dot products and the sigmoid activation
    self.layer1 = sigmoid(np.dot(self.input, self.weights1))
    self.output = sigmoid(np.dot(self.layer1, self.weights2))
```
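
Assuming `feedforward` has been added to the `NeuralNetwork` class, the dot products trace through the shapes from the earlier sketch like this:

```python
# self.input    has shape (4, 3) -- four samples, three features
# self.weights1 has shape (3, 4) -- so self.layer1 gets shape (4, 4)
# self.weights2 has shape (4, 1) -- so self.output gets shape (4, 1)
nn.feedforward()
print(nn.layer1.shape)  # (4, 4)
print(nn.output.shape)  # (4, 1)
```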

Backpropagation is crucial to the learning process of neural networks. It propagates the error from the output layer back toward the input layer, adjusting the weights to minimize the discrepancy between the predicted output (`self.output`) and the actual output (`self.y`). The basic weight-update rule is:

$\Delta w_{ij} = \eta * e_{j} * x_{i}$

Where:

- $\Delta w_{ij}$ denotes the magnitude of weight adjustment
- $\eta$ represents the learning rate, dictating the pace at which our model learns
- $e_{j}$ designates the error term for output unit $j$, representing the difference between the predicted and actual output
- $x_{i}$ identifies the input associated with the weight
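
To make the rule concrete, here is a tiny numeric sketch with invented values:

```python
eta = 0.1  # learning rate
e_j = 0.4  # error term at output unit j (illustrative value)
x_i = 0.9  # input feeding weight w_ij (illustrative value)

delta_w = eta * e_j * x_i
print(delta_w)  # approximately 0.036 -- the amount by which w_ij is adjusted
```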

In the `backprop` method, the error between the predicted and actual outputs signals how much each weight needs to change. The weight gradients (`d_weights2` and `d_weights1`) are computed from this error and the derivatives of the neuron outputs; the weights are then updated using these gradients.

```python
def backprop(self):
    # Performs backpropagation and updates weights
    d_weights2 = np.dot(self.layer1.T,
                        (2 * (self.y - self.output) * sigmoid_derivative(self.output)))
    d_weights1 = np.dot(self.input.T,
                        (np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output),
                                self.weights2.T) * sigmoid_derivative(self.layer1)))

    self.weights1 += self.learning_rate * d_weights1
    self.weights2 += self.learning_rate * d_weights2
```
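
The expression for `d_weights2` comes from applying the chain rule to the squared error $L = (y - \hat{y})^2$, where $\hat{y}$ is `self.output` (the factor of 2 in the code comes from this derivative):

$\frac{\partial L}{\partial weights_2} = -layer_1^T \cdot \big(2(y - \hat{y}) * \hat{y}(1 - \hat{y})\big)$

The code computes the negative of this gradient as `d_weights2` and *adds* it to the weights, which is exactly a gradient-descent step; `d_weights1` chains the same error one layer further back through `weights2` and the hidden layer's sigmoid derivative.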

An epoch signifies one complete pass through the entire training dataset. The `train` method performs feedforward and backpropagation repeatedly over several epochs to adjust the weights and minimize the error. Multiple epochs give the model many opportunities to learn from its mistakes and converge toward weights that yield accurate predictions.

```python
def train(self, epochs):
    # Repeatedly performs feedforward and backpropagation for several epochs
    for epoch in range(epochs):
        self.feedforward()
        self.backprop()
```
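
If you want to watch the error shrink during training, a variant of `train` might log the mean squared error periodically; a minimal sketch (the `log_every` parameter and its interval are illustrative additions, not part of the original class):

```python
def train(self, epochs, log_every=1000):
    # Same loop as above, with optional progress logging
    for epoch in range(epochs):
        self.feedforward()
        self.backprop()
        if log_every and epoch % log_every == 0:
            mse = np.mean((self.y - self.output) ** 2)
            print(f"Epoch {epoch}: MSE = {mse:.6f}")
```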

Now we can define the `predict` method.

```python
def predict(self, new_input):
    # Propagates a new input through the trained network
    layer1 = sigmoid(np.dot(new_input, self.weights1))
    output = sigmoid(np.dot(layer1, self.weights2))
    return output
```

The `predict` method computes outputs for new inputs by propagating them through the layers with dot-product operations and the sigmoid activation function. It mirrors `feedforward`, except that it operates on an arbitrary input instead of `self.input`.

Let's apply these concepts to the XOR (exclusive OR) problem, where the output is 1 exactly when one of the two inputs is 1 (the third input column is fixed at 1 and effectively acts as a bias term). We initialize our neural network with inputs `X` and corresponding outputs `Y`, then train it for 10,000 epochs. The weights adjust accordingly, enabling the network to predict XOR correctly.

```python
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
Y = np.array([[0], [1], [1], [0]])
nn = NeuralNetwork(X, Y)

nn.train(10000)
print("\nPredictions:")
for i, x in enumerate(X):
    print(f"Input: {x} ---> Prediction: {nn.predict(np.array([x]))}, Expected: {Y[i]}")
```
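
Because the sigmoid output lies between 0 and 1, you can threshold it at 0.5 to read the predictions as binary XOR values; a small follow-up sketch (with a well-trained network, this should match `Y`):

```python
binary_preds = (nn.predict(X) > 0.5).astype(int)
print(binary_preds)  # ideally [[0], [1], [1], [0]]
```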

Congratulations! You've dissected the fundamental backpropagation algorithm, understood the mathematics behind it, and implemented it from scratch in Python. Experiment with the learning rate, the number of epochs, and the hidden-layer size, and observe how these changes transform your network's output. Keep exploring and enjoy your journey through deep learning!