Lesson 6

Training and Evaluating Neural Networks

Topic Overview

Today, we dive into an integral piece of the deep learning puzzle: training your neural network. In this lesson, we will demystify what training entails and learn how to implement it using TensorFlow. By training the model, the neural network learns from the input data, gradually adjusting its parameters (weights and biases) to minimize the error in its predictions.

Importance of Training a Neural Network

Training a neural network is akin to teaching a child to recognize shapes. The child learns from repeated exposure and feedback, just as a neural network learns from training datasets. In practice, the training process involves several rounds of forwarding input data through the network, calculating the error (the difference between the network's output and the actual, desired output), and adjusting the weights and biases to minimize this error. This process is much like a child adjusting their understanding of shapes based on feedback!

This iterative method allows the neural network to learn independently from the data and can eventually lead to accurate predictions or classifications, thereby enabling us to create powerful and predictive models.

Understanding the `model.fit()` Method

The `model.fit()` method in TensorFlow is our main tool for training a neural network. This method takes in inputs and their corresponding target values, fitting the model to this data over a certain number of iterations known as epochs. Here are the key parameters we need to understand:

  • X: Input data. This is the data from which your model will learn.
  • y: Target data. These are the answers or results that your model should learn to predict.
  • epochs: One epoch is one complete pass through the entire training dataset.
  • batch_size: This is the number of samples per gradient update. It's akin to breaking our dataset into smaller chunks, updating our model's learning parameters after each chunk.
  • validation_split: This value (between 0 and 1) determines the fraction of your training data that should be set aside for validation. Validation data guides the training process by providing a measure of model performance on unseen data.

Let's see this method in action with some code:

Python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense

# Load data
digits = load_digits()
X = digits.data
y = digits.target

# Convert to one-hot encoding
y = to_categorical(y)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Create model
model = Sequential()
model.add(Dense(64, input_dim=len(X[0]), activation='relu'))
model.add(Dense(len(y[0]), activation='softmax'))

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

Visualizing Training History

Understanding how well the training is progressing is key when training a model. This is where the `history` object returned by `model.fit()` comes into play. This object contains the training and validation accuracy and loss for each epoch, which we can use to track the learning progress of our network.
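
For example, here is a minimal sketch of how you might inspect it, assuming the `history` object returned by the training run above:

Python
# The history attribute is a dictionary mapping metric names to
# lists of per-epoch values
print(history.history.keys())
# e.g. dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])

# One validation-accuracy value per epoch
print(history.history['val_accuracy'])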

We can extract this data and visualize it using plots. The following code creates a simple line plot of the training and validation accuracy for each epoch, giving us insight into how effectively our model is learning.

Python
import matplotlib.pyplot as plt

# Plot the training history
plt.plot(history.history['accuracy'], label='accuracy')  # Plotting training accuracy
plt.plot(history.history['val_accuracy'], label='val_accuracy')  # Plotting validation accuracy
plt.xlabel('Epoch')  # Label for x-axis
plt.ylabel('Accuracy')  # Label for y-axis
plt.ylim([0, 1])  # Setting limit for y-axis
plt.legend(loc='lower right')  # Positioning legend
plt.show()  # Displaying the plot

Output:

 1/36 [..............................] - ETA: 0s - loss: 0.4314 - accuracy: 0.8125
13/36 [=========>....................] - ETA: 0s - loss: 0.3962 - accuracy: 0.8774
25/36 [===================>..........] - ETA: 0s - loss: 0.3419 - accuracy: 0.8875
36/36 [==============================] - 0s 7ms/step - loss: 0.3615 - accuracy: 0.8842 - val_loss: 0.3589 - val_accuracy: 0.8785

In the graph, you can see the accuracy (both training and validation) plotted against the number of epochs, where each epoch is one complete pass of all training samples through the model. This visualization provides a view into the progression of learning and can guide adjustments to the learning parameters if necessary.

You can notice that with each passing epoch, both our accuracy (the metric on the training set) and our val_accuracy (the metric on the validation set, i.e., unseen data) improve, but we are still below 0.9 on both. How can we increase them? Given the trend we are seeing, increasing the number of epochs is a sensible next step.

Improving Our Model

Doubling the number of epochs gives us the following code:

Python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense

# Load data
digits = load_digits()
X = digits.data
y = digits.target

# Convert to one-hot encoding
y = to_categorical(y)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Create model
model = Sequential()
model.add(Dense(64, input_dim=len(X[0]), activation='relu'))
model.add(Dense(len(y[0]), activation='softmax'))

# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

import matplotlib.pyplot as plt

# Plot the training history
plt.plot(history.history['accuracy'], label='accuracy')  # Plotting training accuracy
plt.plot(history.history['val_accuracy'], label='val_accuracy')  # Plotting validation accuracy
plt.xlabel('Epoch')  # Label for x-axis
plt.ylabel('Accuracy')  # Label for y-axis
plt.ylim([0, 1])  # Setting limit for y-axis
plt.legend(loc='lower right')  # Positioning legend
plt.show()  # Displaying the plot

Output:

 1/36 [..............................] - ETA: 0s - loss: 0.0531 - accuracy: 1.0000
14/36 [==========>...................] - ETA: 0s - loss: 0.1044 - accuracy: 0.9621
25/36 [===================>..........] - ETA: 0s - loss: 0.1212 - accuracy: 0.9625
36/36 [==============================] - ETA: 0s - loss: 0.1192 - accuracy: 0.9669
36/36 [==============================] - 0s 7ms/step - loss: 0.1192 - accuracy: 0.9669 - val_loss: 0.1665 - val_accuracy: 0.9514

As you can see, we now have above 0.95 accuracy on both the training and validation data, and simply increasing the number of epochs further is unlikely to change much, since our val_accuracy curve has mostly flattened out. This metric is already excellent, so we don't need to tweak the model much more. If we did, however, one of the first things to try would be adjusting the network architecture, for example by adding more layers or changing the width of existing layers, as sketched below.
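
To illustrate, here is one way such an architecture tweak might look, reusing the data and imports from the code above. The extra 32-unit hidden layer is an illustrative choice, not a tuned value, and `model.evaluate()` is used at the end to check performance on the held-out test set:

Python
# A sketch of a slightly deeper network; the extra 32-unit hidden
# layer is an illustrative choice, not a tuned recommendation
model = Sequential()
model.add(Dense(64, input_dim=len(X[0]), activation='relu'))
model.add(Dense(32, activation='relu'))  # Additional hidden layer
model.add(Dense(len(y[0]), activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)

# Evaluating on the held-out test set gives a final check on unseen data
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.4f}")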

We Are Almost There

You have successfully navigated the intricacies of training and evaluating a neural network in this lesson. You've grasped the vital role of training a neural network and seen how it can be performed and visualized using TensorFlow.

We have a few more exercises for you to complete, and then you'll be done with this course. Let's do this!
