Welcome back! Previously, we trained a multi-class classification PyTorch model using our prepared Wine dataset. Now, let's delve into evaluating the model's performance. In this context, the goal is to discern how well our model generalizes from what it learned during training to unfamiliar, unseen data. This lesson digs into loss functions, accuracy computation, and performance interpretation, all key components of the model evaluation process. Additionally, we will visualize the training and validation losses using `matplotlib` to gain better insight into the learning process and model performance.
Before diving into model evaluation, let's briefly recap the process of building and training our PyTorch model using the Wine dataset. Below is the code snippet from the previous lesson that sets up our model, defines the loss function and optimizer, and trains the model over 150 epochs.
```python
import torch
import torch.nn as nn
import torch.optim as optim
from data_preprocessing import load_preprocessed_data

# Load preprocessed data
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Define the model
model = nn.Sequential(
    nn.Linear(13, 10),
    nn.ReLU(),
    nn.Linear(10, 10),
    nn.ReLU(),
    nn.Linear(10, 3)
)

# Define criterion and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model
num_epochs = 150
history = {'loss': [], 'val_loss': []}
for epoch in range(num_epochs):
    model.train()
    optimizer.zero_grad()
    outputs = model(X_train)
    loss = criterion(outputs, y_train)
    loss.backward()
    optimizer.step()
    history['loss'].append(loss.item())

    model.eval()
    with torch.no_grad():
        outputs_val = model(X_test)
        val_loss = criterion(outputs_val, y_test)
        history['val_loss'].append(val_loss.item())
```
This snippet includes:

- A model defined with `nn.Sequential`.
- `nn.CrossEntropyLoss` as the loss function and `Adam` as the optimizer.

With the model trained, we are now in a position to evaluate its performance in a more detailed manner.
During training, we calculated the loss at each epoch for both the training set and the test set (used here as a validation set), and stored these values in our `history` dictionary. This helped us monitor the model's performance and ensure it was not overfitting to the training data.
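As a quick illustration (not part of the original training code), you can inspect the recorded losses at regular intervals. This sketch assumes the `history` dictionary and `num_epochs` from the training loop above; the printed values will vary from run to run:

```python
# Inspect the recorded losses every 25 epochs (values vary per run)
for epoch in range(0, num_epochs, 25):
    print(f"Epoch {epoch + 1:3d}: "
          f"train loss = {history['loss'][epoch]:.4f}, "
          f"val loss = {history['val_loss'][epoch]:.4f}")
```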
Now, to evaluate our fully trained model, we again calculate the test loss and compute the accuracy, which is the fraction of correct predictions over total predictions. We'll use `torch.no_grad()` to disable gradient calculations, as they are not needed during evaluation, and the `accuracy_score` function from `sklearn.metrics` to compute accuracy.
Here's the code for performing these evaluations in PyTorch:
```python
from sklearn.metrics import accuracy_score

# Set the model to evaluation mode
model.eval()

# Disable gradient calculation
with torch.no_grad():
    # Input the test data into the model
    outputs = model(X_test)
    # Calculate the Cross Entropy Loss
    test_loss = criterion(outputs, y_test).item()
    # Choose the class with the highest value as the predicted output
    _, predicted = torch.max(outputs, 1)
    # Calculate the accuracy
    test_accuracy = accuracy_score(y_test, predicted)

print(f'Test Accuracy: {test_accuracy:.4f}, Test Loss: {test_loss:.4f}')
```
In this code, we first set the model to evaluation mode using `model.eval()`, which switches layers like dropout and batch normalization from their training behavior to their inference behavior. Next, we disable gradient calculation with `torch.no_grad()` to save memory and computational resources. We then pass the test data through the model to generate the outputs, and calculate the Cross Entropy Loss from these outputs to evaluate the model's performance.
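Our Wine model happens to contain no such layers, so `eval()` does not change its outputs here, but calling it is a good habit. The toy sketch below, using a standalone `nn.Dropout` layer that is not part of our model, shows the behavioral difference `eval()` controls:

```python
import torch
import torch.nn as nn

dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 6)

dropout.train()    # training mode: randomly zeroes elements, scales the rest by 1/(1-p)
print(dropout(x))  # e.g. tensor([[2., 0., 2., 2., 0., 0.]]) -- varies per call

dropout.eval()     # evaluation mode: dropout is a no-op
print(dropout(x))  # tensor([[1., 1., 1., 1., 1., 1.]])
```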
We also obtain the predicted class labels by selecting the class with the highest value using `torch.max(outputs, 1)`, which returns two tensors: the maximum value along dimension 1 (which we discard by assigning it to `_`, as it's not needed) and the index of that maximum, which corresponds to the predicted class.
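To make the mechanics of `torch.max` concrete, here is a tiny sketch with made-up logits for three samples and three classes; the values are purely illustrative, not outputs of our trained model:

```python
import torch

# Hypothetical raw outputs (logits) for 3 samples and 3 classes
logits = torch.tensor([[ 2.1, -0.3,  0.4],
                       [-1.2,  0.8,  0.1],
                       [ 0.2,  0.5,  3.0]])

values, indices = torch.max(logits, 1)
print(values)   # tensor([2.1000, 0.8000, 3.0000]) -- the max value per row
print(indices)  # tensor([0, 1, 2]) -- the predicted class per row
```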
Finally, we compute the test accuracy by comparing the predicted labels with the true labels using the `accuracy_score` function from `sklearn.metrics`. The output values for test accuracy and test loss provide quantitative measures of our model's performance on unseen test data, offering insight into its generalization capability.
```
Test Accuracy: 0.9259, Test Loss: 0.4211
```
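As a side note, nothing forces the use of `sklearn` here: since `predicted` and `y_test` are both tensors, accuracy can also be computed directly in PyTorch. A minimal equivalent sketch:

```python
# Equivalent accuracy computation in pure PyTorch:
# fraction of positions where predicted and true labels match
test_accuracy = (predicted == y_test).float().mean().item()
print(f'Test Accuracy: {test_accuracy:.4f}')
```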
Visualizing the loss data during model evaluation is crucial, as it helps us understand the learning progress of our model over time. By plotting the training and validation loss, we can identify patterns such as overfitting or underfitting, providing valuable insights for fine-tuning the model. Using `matplotlib`, a widely used plotting library in Python, we will graph the loss history recorded during training to visually assess the model's performance.
Here's how we plot the loss data:
```python
import matplotlib.pyplot as plt

# Plot the recorded training and validation loss
epochs = range(1, num_epochs + 1)
train_loss = history['loss']
val_loss = history['val_loss']

plt.figure(figsize=(8, 5))
plt.plot(epochs, train_loss, label='Training Loss')
plt.plot(epochs, val_loss, label='Validation Loss')
plt.title('Model Loss During Training')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.show()
```
The generated plot might look like this:
In the plot, both the training loss and the validation loss decrease steadily, which means the model was learning well from the data. The two curves stay close together, so the model wasn't just memorizing the training data; it was actually learning to generalize, which means it should perform well on new, similar data. The smooth decrease in loss also tells us that the training process was stable, and overall, the model trained quite successfully.
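If you prefer a numeric check alongside the visual one, a simple heuristic is to compare the final training and validation losses: a large gap hints at overfitting. The sketch below is a rough illustration with an arbitrarily chosen threshold, not a standard diagnostic:

```python
# Rough overfitting check: compare final training and validation loss
gap = history['val_loss'][-1] - history['loss'][-1]
if gap > 0.2:  # threshold chosen arbitrarily for illustration
    print(f"Validation loss exceeds training loss by {gap:.4f}: possible overfitting.")
else:
    print(f"Loss gap is {gap:.4f}: no strong sign of overfitting.")
```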
In this lesson, we explored different aspects of model evaluation, including calculating loss, predicting labels, and determining accuracy. We also learned how to visualize our loss data using `matplotlib` to assess the training progress. Upcoming practice exercises will give you hands-on experience in applying these concepts. Remember, mastering PyTorch requires a balanced mix of understanding and application. Happy coding!
For your convenience, here is the helper code snippet for loading and preprocessing the Wine dataset, which you can use to ensure your data is properly prepared for training and evaluation in PyTorch:
```python
import torch
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def load_preprocessed_data():
    # Load the Wine dataset
    wine = load_wine()
    X, y = wine.data, wine.target

    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y)

    # Scale the features
    scaler = StandardScaler().fit(X_train)
    X_train_scaled = scaler.transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Convert to PyTorch tensors
    X_train_tensor = torch.tensor(X_train_scaled, dtype=torch.float32)
    X_test_tensor = torch.tensor(X_test_scaled, dtype=torch.float32)
    y_train_tensor = torch.tensor(y_train, dtype=torch.long)
    y_test_tensor = torch.tensor(y_test, dtype=torch.long)

    return X_train_tensor, X_test_tensor, y_train_tensor, y_test_tensor
```
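As a quick sanity check, you can call the loader and inspect the tensor shapes. The Wine dataset has 178 samples with 13 features each, so a 70/30 split yields roughly 124 training and 54 test rows (the exact counts come from `train_test_split`):

```python
# Quick sanity check of the preprocessed tensors
X_train, X_test, y_train, y_test = load_preprocessed_data()
print(X_train.shape, y_train.shape)  # e.g. torch.Size([124, 13]) torch.Size([124])
print(X_test.shape, y_test.shape)    # e.g. torch.Size([54, 13]) torch.Size([54])
print(y_train.unique())              # tensor([0, 1, 2]) -- the three wine classes
```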