Hello! In today's lesson, we will dive into evaluating models in PyTorch. Evaluating a model's performance plays a key role in building an effective machine learning model: it helps us understand how well the model generalizes to unseen data. We will do this by using a test dataset, making predictions with our trained model, and comparing those predictions with the ground-truth values in the test dataset.
Before evaluating the model, let's quickly recap the training process:
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Training Features
X_train = torch.tensor([
    [3.0, 0.5], [1.0, 1.0], [0.5, 2.0], [2.0, 1.5],
    [3.5, 3.0], [2.0, 2.5], [1.5, 1.0], [0.5, 0.5],
    [2.5, 0.8], [2.1, 2.0], [1.2, 0.5], [0.7, 1.5]
], dtype=torch.float32)

# Training Targets
y_train = torch.tensor([[1], [0], [0], [1], [1], [0], [1], [0], [1], [0], [1], [0]], dtype=torch.float32)

# Define the model using nn.Sequential
model = nn.Sequential(
    nn.Linear(2, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
    nn.Sigmoid()
)

# Define loss function and optimizer
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Train the model for 50 epochs
for epoch in range(50):
    model.train()                        # Set the model to training mode
    optimizer.zero_grad()                # Zero the gradients
    outputs = model(X_train)             # Compute predictions
    loss = criterion(outputs, y_train)   # Compute the loss
    loss.backward()                      # Compute the gradients
    optimizer.step()                     # Update the parameters
```
Here’s a summary of the steps:

- Define the training features X_train and targets y_train.
- Build the model with nn.Sequential, with one hidden layer (ReLU) and an output layer (Sigmoid).
- Define the loss function (BCELoss) and the Adam optimizer.
- Train the model for 50 epochs.

Now let's move to evaluating our model.
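As a quick sanity check of the setup above, the sketch below rebuilds the same architecture from scratch (untrained, purely for illustration, so its outputs are not the trained model's) and confirms that a batch of 2-feature rows maps to one probability per row:

```python
import torch
import torch.nn as nn

# Rebuild the same architecture (untrained) just to illustrate input/output shapes.
model = nn.Sequential(
    nn.Linear(2, 10),
    nn.ReLU(),
    nn.Linear(10, 1),
    nn.Sigmoid()
)

batch = torch.tensor([[2.0, 1.5], [0.5, 0.5]], dtype=torch.float32)
probs = model(batch)

print(probs.shape)  # torch.Size([2, 1]): one probability per input row
print(bool(((probs > 0) & (probs < 1)).all()))  # True: Sigmoid keeps outputs in (0, 1)
```

Because the last layer is a Sigmoid, every output is a probability between 0 and 1, which is exactly what BCELoss expects.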
Before we evaluate our model, we need to prepare our test dataset. The test dataset consists of new data points that the model has never seen before. This helps us understand how well our model generalizes to unseen data.
Let's define our testing data with the same format our model was trained on in the previous lessons:
```python
import torch

# Test Features
X_test = torch.tensor([[2.5, 1.0], [0.8, 0.8], [1.0, 2.0], [3.0, 2.5]], dtype=torch.float32)

# Test Targets
y_test = torch.tensor([[1], [0], [0], [1]], dtype=torch.float32)
```
With our test data ready, we can now move on to evaluating our model's performance using these new examples.
Next, we introduce the concept of evaluation metrics. Evaluation metrics quantify how well our model performs. There are many metrics, but we will focus on accuracy in this lesson.
Accuracy is a useful measure when the target variable classes in the data are nearly balanced. It is defined as the number of correct predictions divided by the total number of predictions. Mathematically, it is represented as:

Accuracy = Number of Correct Predictions / Total Number of Predictions
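To see the formula in action, here is a small, hypothetical sketch (the probabilities and targets below are invented for illustration) that thresholds raw probabilities at 0.5 and then applies the accuracy formula by hand:

```python
import torch

# Invented probabilities and targets, purely for illustration.
probs = torch.tensor([[0.9], [0.2], [0.7], [0.4]])
targets = torch.tensor([[1], [0], [0], [1]])

predicted = (probs > 0.5).int()                # threshold at 0.5 -> [[1], [0], [1], [0]]
correct = (predicted == targets).sum().item()  # 2 predictions match the targets
accuracy = correct / targets.numel()           # 2 / 4

print(accuracy)  # 0.5
```

Two of the four thresholded predictions match the targets, so the accuracy is 2 / 4 = 0.5.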
We can easily apply this using Scikit-Learn. The sklearn.metrics module includes score functions, performance metrics, pairwise metrics, and distance computations; we import the accuracy_score function from this module.
```python
# Import the accuracy_score function
from sklearn.metrics import accuracy_score
```
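As a quick, hypothetical sanity check (the labels below are invented, not our test data), accuracy_score computes exactly the ratio of correct to total predictions described above:

```python
from sklearn.metrics import accuracy_score

# Invented labels for illustration: 3 of the 4 predictions agree with the truth.
y_true = [1, 0, 0, 1]
y_pred = [1, 0, 1, 1]

print(accuracy_score(y_true, y_pred))  # 0.75
```

It accepts array-like inputs, which is why we will convert our PyTorch tensors with .numpy() before passing them in.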
Once the model is trained, we switch it to evaluation mode using model.eval(). Finally, it's time to evaluate our model. Here's an outline of the steps followed in the code:

- Set the model to evaluation mode and disable gradient tracking with torch.no_grad().
- Make predictions on the test features.
- Convert the output probabilities to binary classes using a 0.5 threshold.
- Calculate the loss and the accuracy on the test data.
Let's run our model evaluation code:
```python
# Set evaluation mode and disable gradient tracking
model.eval()
with torch.no_grad():
    # Make Predictions
    outputs = model(X_test)
    # Convert to binary classes
    predicted_classes = (outputs > 0.5).int()
    # Calculate the loss on the test data
    test_loss = criterion(outputs, y_test).item()
    # Calculate the accuracy on the test data
    test_accuracy = accuracy_score(y_test.numpy(), predicted_classes.numpy())

# Print the test accuracy and loss
print(f'\nTest accuracy: {test_accuracy}, Test loss: {test_loss}')
```
The output of the code is the accuracy of the model on the test set and the loss on the test set:
```text
Test accuracy: 1.0, Test loss: 0.19731628894805908
```
This output shows that our model classified all the test examples correctly, achieving 100% accuracy. The very low test loss indicates that the model's predictions are very close to the actual targets. However, it's important to note that while perfect accuracy and low loss on the test set are desirable, they don't always guarantee that the model will perform equally well on entirely new data. Overfitting can sometimes produce high performance on the test set at the expense of generalizability. Therefore, it's essential to validate these results with a larger and more varied dataset in real-world applications.
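A side note on the model.eval() call used above: for our simple Linear/ReLU/Sigmoid model it changes nothing visible, but for models containing layers such as dropout it is essential, because those layers behave randomly in training mode. The sketch below uses a hypothetical dropout model, invented purely for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical model with dropout, invented to show what eval mode changes.
net = nn.Sequential(nn.Linear(2, 4), nn.Dropout(p=0.5), nn.Linear(4, 1))

net.eval()  # in eval mode, dropout becomes a no-op, so outputs are deterministic
x = torch.ones(1, 2)
with torch.no_grad():
    out1 = net(x)
    out2 = net(x)

print(torch.equal(out1, out2))  # True: identical outputs for identical inputs
```

In training mode, the same two forward passes would generally differ, because dropout randomly zeroes activations on each pass.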
Excellent! You now understand how to evaluate a model in PyTorch. Understanding model performance is core to building effective machine learning models. Going forward, these skills will be crucial for understanding and improving the performance of your models.
All that's left is to practice this skill. Up next, you'll find practice exercises designed to reinforce what you've learned today. Keep going - you're doing great!