Welcome! Today's agenda is focused on delving deeper into an important aspect of model optimization using TensorFlow - Callbacks. We'll explore what they are, why they matter, and showcase the implementation of different types of callbacks in TensorFlow. By the end of this lesson, your understanding of callbacks and how to utilize them in TensorFlow will have notably improved. Let's get started!
Let's first understand what callbacks are. In TensorFlow, a callback is a Python class that you can customize to perform specific actions at various stages of training, such as at the beginning or end of a batch or epoch, or during testing and prediction. Callbacks are useful for monitoring the internal state and statistics of the model while it is being trained, making it easier to manage and optimize the training process. For example, you are already familiar with the EarlyStopping callback from a previous lesson, which stops training when a monitored metric stops improving.
The primary use of callbacks is during the model training phase, where they are passed as a list to the .fit() method of the Sequential or Model classes. They allow custom actions to occur at different points in training, such as stopping training early under certain conditions, saving model weights to disk, adjusting the learning rate, or performing user-defined actions.

Callbacks are an especially helpful tool when training larger models, as they give us more control over the training process.
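As a quick refresher from that earlier lesson, here is a minimal sketch of how a callback such as EarlyStopping is passed to .fit(); it assumes a compiled model and training data are already defined:

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once the validation loss fails to improve for 3 epochs
early_stopping = EarlyStopping(monitor='val_loss', patience=3)

# Callbacks are passed as a list to .fit()
model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=50,
          callbacks=[early_stopping])
```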
In this lesson, we will cover three important types of callbacks:
- ModelCheckpoint: This callback allows you to save the model's state at different stages of training, ensuring you don't lose progress and can store the best-performing model.
- LearningRateScheduler: This callback provides a mechanism to adjust the learning rate dynamically based on the epoch, helping the model converge more efficiently.
- Custom Callbacks: You will learn how to create your own custom actions to be executed during the training process, offering flexibility for unique requirements.
By mastering these callbacks, you will have greater flexibility and control over your model training workflows.
Before we delve into the use of callbacks, let's recap the steps to load our preprocessed data and define our model. We will use a function from a secondary file named data_preprocessing.py to load the preprocessed Iris dataset, which we prepared in previous lessons:
```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Define the model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
With the data loaded and the model defined and compiled, we are ready to explore the different types of callbacks.
The first type of callback we'll explore is ModelCheckpoint. This callback can save the model's state at the end of every epoch if desired, or only when certain conditions are met (e.g., when the monitored metric improves), creating checkpoints at various stages of training.
Here is an example of how ModelCheckpoint is used:
```python
from tensorflow.keras.callbacks import ModelCheckpoint

model_checkpoint = ModelCheckpoint(filepath='best_model.keras', save_best_only=True, monitor='val_accuracy')
```
The filepath parameter is a string containing the file path where the model checkpoint will be saved. A new model is saved to this path at the end of each epoch.

The save_best_only parameter, if set to True, causes the saved model to be replaced only when the monitor metric improves, ensuring that only the best model is kept. If set to False, a new model is saved every epoch.

The monitor parameter is the metric you want to observe. If save_best_only=True, the current model is saved only when this metric improves.
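As a side note beyond the lesson's main example, the filepath string can also contain formatting placeholders that Keras fills in at save time, such as the epoch number or any logged metric. A sketch, with a hypothetical file naming pattern:

```python
from tensorflow.keras.callbacks import ModelCheckpoint

# {epoch} and {val_accuracy} are filled in by Keras when the file is saved,
# so each improvement produces a distinctly named checkpoint file
checkpoint_tagged = ModelCheckpoint(
    filepath='model_{epoch:02d}_{val_accuracy:.2f}.keras',
    save_best_only=True,
    monitor='val_accuracy')
```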
The second callback is LearningRateScheduler. This callback adjusts the learning rate according to a schedule that you define. Let's explore this through an example:
```python
from tensorflow.keras.callbacks import LearningRateScheduler

def scheduler(epoch, lr):
    if epoch < 10:
        return lr
    else:
        return lr * 0.1

learning_rate_scheduler = LearningRateScheduler(scheduler)
```
The LearningRateScheduler callback in Keras takes a function, here named scheduler, as an argument. This function should accept two parameters, the current epoch (an integer, indexed from 0) and the current learning rate, and should return the learning rate for the next epoch. LearningRateScheduler automatically passes the epoch number and current learning rate to the scheduler function during training.
In the above example, the learning rate is kept unchanged for the first ten epochs. From the eleventh epoch onward, the current learning rate is multiplied by 0.1 at the start of each epoch, so it is reduced by 90% each time and the reduction compounds. This is known as learning rate decay, and it is a common technique to help the model converge.
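As a variation on this step-style schedule, a smoother exponential decay is also common. The sketch below is our own illustration rather than part of the lesson's example; the decay factor of 0.05 is an arbitrary choice:

```python
import math
from tensorflow.keras.callbacks import LearningRateScheduler

def exp_decay_scheduler(epoch, lr):
    # Hold the initial rate for the first ten epochs, then decay gently:
    # multiplying by exp(-0.05) shrinks the rate by roughly 5% per epoch
    if epoch < 10:
        return lr
    return lr * math.exp(-0.05)

smooth_scheduler = LearningRateScheduler(exp_decay_scheduler)
```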
In addition to built-in callbacks, you can create your own. This offers a high level of flexibility to create and experiment with new strategies. A custom callback class overrides specific methods that correspond to different stages of the training process, one example being on_epoch_end. Let's see how to create a custom callback that prints the accuracy at the end of each epoch:
```python
from tensorflow.keras.callbacks import Callback

class CustomCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        print(f"End of epoch {epoch + 1}. Accuracy: {logs['accuracy']:.4f}")

custom_callback = CustomCallback()
```
Here, we define a Callback subclass and override the method on_epoch_end, which gets called at the end of every epoch. The method automatically receives two parameters: epoch, the current epoch number (indexed from 0), and logs, a dictionary containing the loss value along with all of the metrics we are using.
The output of this callback during model training will be:
```text
End of epoch 1. Accuracy: 0.5200
End of epoch 2. Accuracy: 0.5500
End of epoch 3. Accuracy: 0.5400
End of epoch 4. Accuracy: 0.5500
End of epoch 5. Accuracy: 0.5400
```
This output demonstrates how the custom callback prints the accuracy at the end of each epoch, providing real-time feedback on the training performance of the model; you could also print other metrics or execute different custom actions.
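To illustrate a different custom action, here is another sketch of our own, with a hypothetical StopAtAccuracy class and threshold: it halts training once training accuracy reaches a target, using the stop_training flag that Keras checks after each epoch:

```python
from tensorflow.keras.callbacks import Callback

class StopAtAccuracy(Callback):
    def __init__(self, target=0.95):
        super().__init__()
        self.target = target  # hypothetical accuracy threshold

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get('accuracy', 0.0) >= self.target:
            print(f"Reached {self.target:.0%} accuracy at epoch {epoch + 1}; stopping.")
            # self.model is set by Keras; setting stop_training ends .fit() cleanly
            self.model.stop_training = True
```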
Now, we are ready to apply these three callbacks to our model during training. We pass a list of callbacks to the model.fit() method, like so:
```python
model.fit(
    X_train,
    y_train,
    validation_data=(X_test, y_test),
    epochs=50,
    callbacks=[model_checkpoint, learning_rate_scheduler, custom_callback],
    verbose=0)
```
With this approach, all three callbacks are active during training: ModelCheckpoint saves the best model whenever val_accuracy improves, LearningRateScheduler adjusts the learning rate according to our schedule, and our custom callback's on_epoch_end method fires at the end of each epoch.
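Since ModelCheckpoint was configured with save_best_only=True, the best-performing model ends up in best_model.keras. After training, you can reload it with the standard Keras loader and evaluate it:

```python
import tensorflow as tf

# Restore the best checkpoint saved during training
best_model = tf.keras.models.load_model('best_model.keras')

# Evaluate the restored model on the test set
loss, accuracy = best_model.evaluate(X_test, y_test, verbose=0)
print(f"Best model test accuracy: {accuracy:.4f}")
```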
Excellent work! We've covered what callbacks are and how they can be used to customize and control the training process in TensorFlow. We've focused on how to use ModelCheckpoint to save our model, how to use LearningRateScheduler to adjust the learning rate, and how to create and use a custom callback to execute custom actions.
The ability to calibrate and control TensorFlow training processes using callbacks is a powerful skill. Now you're ready to take on the practice tasks where your understanding of callbacks and their implementations will be fortified. Through these tasks, your TensorFlow skills will be further refined. Keep going and happy learning!